Pig / PIG-627

PERFORMANCE: multi-query optimization

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.2.0
    • Fix Version/s: 0.3.0
    • Component/s: None
    • Labels: None

    Description

      Currently, if your Pig script contains multiple stores and some shared computation, Pig will execute several independent queries. For instance:

      A = load 'data' as (a, b, c);
      B = filter A by a > 5;
      store B into 'output1';
      C = group B by b;
      store C into 'output2';

      This script results in a map-only job that generates output1, followed by a map-reduce job that generates output2. As a result, the data is read, parsed, and filtered twice, which is unnecessary and costly.
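
      The cost described above can be illustrated with a small in-memory sketch. This is a toy model, not Pig's planner or any Pig API: the CountingSource class, the helper functions, and the sample tuples are all hypothetical, chosen only to count how many tuples are read when each store runs as its own query versus when the shared filter result feeds both stores.

```python
# Toy in-memory model of the script above; not Pig's actual planner or API.
from collections import defaultdict

# Made-up (a, b, c) tuples standing in for the contents of 'data'.
rows = [(3, "x", 1), (7, "x", 2), (9, "y", 3), (6, "y", 4)]


class CountingSource:
    """Hypothetical stand-in for `load 'data'` that counts tuples read."""

    def __init__(self, data):
        self.data = data
        self.tuples_read = 0

    def scan(self):
        for row in self.data:
            self.tuples_read += 1
            yield row


def group_by_b(tuples):
    """Group tuples on field b, like `group B by b`."""
    groups = defaultdict(list)
    for t in tuples:
        groups[t[1]].append(t)
    return dict(groups)


def run_independently(source):
    """One query per store: the load and filter run once for each output."""
    output1 = [t for t in source.scan() if t[0] > 5]
    output2 = group_by_b(t for t in source.scan() if t[0] > 5)
    return output1, output2


def run_shared(source):
    """Single pass: B is computed once and feeds both stores."""
    b = [t for t in source.scan() if t[0] > 5]
    return b, group_by_b(b)


independent_source = CountingSource(rows)
shared_source = CountingSource(rows)
independent_result = run_independently(independent_source)
shared_result = run_shared(shared_source)

# Both plans produce the same outputs; only the amount of input read differs.
print(independent_source.tuples_read, shared_source.tuples_read)  # prints "8 4"
```

      In this sketch both plans yield identical output1 and output2, but the independent plan scans and filters every input tuple twice. How Pig itself shares the work across jobs is not specified in this description.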

      Attachments

        1. streaming-fix.patch
          10 kB
          Gunther Hagleitner
        2. noop_filter_absolute_path_flag.patch
          88 kB
          Gunther Hagleitner
        3. noop_filter_absolute_path_flag_0401.patch
          125 kB
          Gunther Hagleitner
        4. non_reversible_store_load_dependencies.patch
          76 kB
          Gunther Hagleitner
        5. non_reversible_store_load_dependencies_2.patch
          90 kB
          Gunther Hagleitner
        6. multi-store-0304.patch
          78 kB
          Gunther Hagleitner
        7. multi-store-0303.patch
          77 kB
          Gunther Hagleitner
        8. multiquery-phase3_0423.patch
          77 kB
          Richard Ding
        9. multiquery-phase2_0323.patch
          88 kB
          Richard Ding
        10. multiquery-phase2_0313.patch
          86 kB
          Richard Ding
        11. multiquery_explain_fix.patch
          3 kB
          Gunther Hagleitner
        12. multiquery_0306.patch
          32 kB
          Richard Ding
        13. multiquery_0224.patch
          146 kB
          Gunther Hagleitner
        14. multiquery_0223.patch
          110 kB
          Gunther Hagleitner
        15. merge-041409.patch
          21 kB
          Gunther Hagleitner
        16. merge_trunk_to_branch.patch
          13 kB
          Gunther Hagleitner
        17. merge_741727_HEAD__0324.patch
          591 kB
          Gunther Hagleitner
        18. merge_741727_HEAD__0324_2.patch
          595 kB
          Gunther Hagleitner
        19. fix_store_prob.patch
          26 kB
          Gunther Hagleitner
        20. file_cmds-0305.patch
          33 kB
          Gunther Hagleitner
        21. error_handling_0416.patch
          27 kB
          Gunther Hagleitner
        22. error_handling_0415.patch
          27 kB
          Gunther Hagleitner
        23. doc-fix.patch
          5 kB
          Gunther Hagleitner

          People

            hagleitn Gunther Hagleitner
            olgan Olga Natkovich
            Votes: 0
            Watchers: 7
