PIG-2009

Better MergeForEach rule


Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • 0.9.0
    • None
    • None
    • None

    Description

      The MergeForEach rule will not merge two consecutive ForEach operators if the second ForEach has an inner relational plan. This prevents some optimizations. For example:

      A = LOAD 'input1' AS (a0, a1, a2);
      B = LOAD 'input2' AS (b0, b1, b2);
      C = cogroup A by a0, B by b0;
      D = foreach C { E = limit A 10; F = E.a1; G = DISTINCT F; generate group, COUNT(G);};
      explain D;
      

      We add a ForEach after the cogroup to prune B; however, we cannot merge this ForEach with D because D has an inner relational plan. As a result, secondary key optimization is disabled for this query.
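
      The situation can be pictured as the script below. This is only a hand-written sketch: the alias C1 is hypothetical and stands in for the ForEach that is added to prune B. The optimizer works on the logical plan and does not literally rewrite the script this way.

      ```
      A = LOAD 'input1' AS (a0, a1, a2);
      B = LOAD 'input2' AS (b0, b1, b2);
      C = cogroup A by a0, B by b0;
      -- Hypothetical alias: the pruning ForEach that keeps only group and A, dropping B
      C1 = foreach C generate group, A;
      -- D has an inner relational plan (limit, projection, distinct), so it is not merged with C1
      D = foreach C1 { E = limit A 10; F = E.a1; G = DISTINCT F; generate group, COUNT(G);};
      explain D;
      ```

      A merge rule that handled inner plans would fold C1 into D, leaving a single ForEach directly after the cogroup.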

          People

            daijy Daniel Dai