Pig
  1. Pig
  2. PIG-946

Combiner optimizer does not optimize when limit follow group, foreach

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.3.0
    • Fix Version/s: 0.9.0
    • Component/s: None
    • Labels:
      None

      Description

      The following script is combinable but is not optimized:
      a = load '/user/pig/tests/data/singlefile/studenttab10k';
      b = group a by $1;
      c = foreach b generate group, AVG(a.$2);
      d = limit c 10;
      dump d;

      1. PIG-946-codechange-draft.patch
        1 kB
        Pradeep Kamath
      2. PIG-946.1.patch
        5 kB
        Thejas M Nair

        Activity

        Hide
        Thejas M Nair added a comment -

        Patch committed to trunk.

        Show
        Thejas M Nair added a comment - Patch committed to trunk.
        Hide
        Daniel Dai added a comment -

        +1

        Show
        Daniel Dai added a comment - +1
        Hide
        Thejas M Nair added a comment -

        Test passes test-patch and unit tests.
        [exec] +1 overall.
        [exec]
        [exec] +1 @author. The patch does not contain any @author tags.
        [exec]
        [exec] +1 tests included. The patch appears to include 3 new or modified tests.
        [exec]
        [exec] +1 javadoc. The javadoc tool did not generate any warning messages.
        [exec]
        [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
        [exec]
        [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
        [exec]
        [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.

        Show
        Thejas M Nair added a comment - Test passes test-patch and unit tests. [exec] +1 overall. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 3 new or modified tests. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.
        Hide
        Pradeep Kamath added a comment -

        Attached patch has a first draft of the code changes required in CombinerOptiimizer.java. This is a non tested draft patch - needs to be augmented with testcases - attaching it merely for reference - should not be committed yet.

        Show
        Pradeep Kamath added a comment - Attached patch has a first draft of the code changes required in CombinerOptiimizer.java. This is a non tested draft patch - needs to be augmented with testcases - attaching it merely for reference - should not be committed yet.
        Hide
        Pradeep Kamath added a comment -

        The root cause is in the CombinerOptimizer, the code expects POPackage to have POForEach as the successor and then analyzes the POForEach to check if it can be combined. In the above query, the OpLimitOptimizer pushes the limit between the cogroup and foreach which in the MRPlan shows up as a POLimit between the POPackage and POForEach. In this case, the CombinerOptimizer should ignore the POLimit and still analyze the POForEach to check if combiner optimization is possible.

        Show
        Pradeep Kamath added a comment - The root cause is in the CombinerOptimizer, the code expects POPackage to have POForEach as the successor and then analyzes the POForEach to check if it can be combined. In the above query, the OpLimitOptimizer pushes the limit between the cogroup and foreach which in the MRPlan shows up as a POLimit between the POPackage and POForEach. In this case, the CombinerOptimizer should ignore the POLimit and still analyze the POForEach to check if combiner optimization is possible.

          People

          • Assignee:
            Thejas M Nair
            Reporter:
            Pradeep Kamath
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development