Pig / PIG-5299

PartitionFilterOptimizer failing at compile time

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.17.1
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      The following (rather simple) script

      test.pig
      A = LOAD '/tmp/testinput' using org.apache.pig.test.TestLoader ('srcid:int, mrkt:chararray, dstid:int, name:chararray', 'srcid'); --srcid is the partition-key
      B = filter A by dstid != 10 OR ((dstid < 3000 and srcid == 1000) OR (dstid >= 3000 and srcid == 2000));
      dump B;

      fails with

      2017-09-07 16:37:03,210 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2243: Attempt to remove operator GreaterThanEqual that is still connected in the plan

        Activity

        knoguchi Koji Noguchi added a comment -

        Log file showing the trace.

        Pig Stack Trace
        ---------------
        ERROR 2243: Attempt to remove operator GreaterThanEqual that is still connected in the plan
        
        org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias B
                at org.apache.pig.PigServer.openIterator(PigServer.java:1020)
                at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:782)
                at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:383)
                at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:230)
                at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:205)
                at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81)
                at org.apache.pig.Main.run(Main.java:630)
                at org.apache.pig.Main.main(Main.java:175)
        Caused by: org.apache.pig.PigException: ERROR 1002: Unable to store alias B
                at org.apache.pig.PigServer.storeEx(PigServer.java:1123)
                at org.apache.pig.PigServer.store(PigServer.java:1082)
                at org.apache.pig.PigServer.openIterator(PigServer.java:995)
                ... 7 more
        Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2000: Error processing rule PartitionFilterOptimizer. Try -t PartitionFilterOptimizer
                at org.apache.pig.newplan.optimizer.PlanOptimizer.optimize(PlanOptimizer.java:125)
                at org.apache.pig.newplan.logical.relational.LogicalPlan.optimize(LogicalPlan.java:281)
                at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1458)
                at org.apache.pig.PigServer.storeEx(PigServer.java:1119)
                ... 9 more
        Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 2243: Attempt to remove operator GreaterThanEqual that is still connected in the plan
                at org.apache.pig.newplan.BaseOperatorPlan.remove(BaseOperatorPlan.java:172)
                at org.apache.pig.newplan.FilterExtractor.removeFromFilteredPlan(FilterExtractor.java:340)
                at org.apache.pig.newplan.FilterExtractor.removeFromFilteredPlan(FilterExtractor.java:337)
                at org.apache.pig.newplan.FilterExtractor.removeFromFilteredPlan(FilterExtractor.java:337)
                at org.apache.pig.newplan.FilterExtractor.checkPushDown(FilterExtractor.java:255)
                at org.apache.pig.newplan.FilterExtractor.visit(FilterExtractor.java:106)
                at org.apache.pig.newplan.logical.rules.PartitionFilterOptimizer$PartitionFilterPushDownTransformer.transform(PartitionFilterOptimizer.java:155)
                at org.apache.pig.newplan.optimizer.PlanOptimizer.optimize(PlanOptimizer.java:110)
                ... 12 more
        

        Basically, when PartitionFilterOptimizer gives up on pushing down a certain filter condition, updating the filteredPlan fails.

        knoguchi Koji Noguchi added a comment -

        Pasting the filter plan with hash codes (to check uniqueness):

        |---(Name: Constant Type: null Uid: null):1947185929
        (Name: And Type: null Uid: null):1774720883
        |
        |---(Name: Or Type: null Uid: null):1605851606
        |   |
        |   |---(Name: Equal Type: null Uid: null):1534754611
        |   |   |
        |   |   |---(Name: Project Type: null Uid: null Input: 0 Column: 0):944140566
        |   |   |
        |   |   |---(Name: Constant Type: null Uid: null):1551446957
        |   |
        |   |---(Name: GreaterThanEqual Type: null Uid: null):1020154737
        |       |
        |       |---(Name: Project Type: null Uid: null Input: 0 Column: 2):987249254
        |       |
        |       |---(Name: Constant Type: null Uid: null):1850954068
        |
        |---(Name: And Type: null Uid: null):1440621772
            |
            |---(Name: Or Type: null Uid: null):352083716
            |   |
            |   |---(Name: LessThan Type: null Uid: null):713312506
            |   |   |
            |   |   |---(Name: Project Type: null Uid: null Input: 0 Column: 2):597307515
            |   |   |
            |   |   |---(Name: Constant Type: null Uid: null):770010802
            |   |
            |   |---(Name: Equal Type: null Uid: null):223693919
            |       |
            |       |---(Name: Project Type: null Uid: null Input: 0 Column: 0):1758056825
            |       |
            |       |---(Name: Constant Type: null Uid: null):361268035
            |
            |---(Name: Or Type: null Uid: null):1787189503
                |
                |---(Name: LessThan Type: null Uid: null):713312506
                |   |
                |   |---(Name: Project Type: null Uid: null Input: 0 Column: 2):597307515
                |   |
                |   |---(Name: Constant Type: null Uid: null):770010802
                |
                |---(Name: GreaterThanEqual Type: null Uid: null):1020154737
                    |
                    |---(Name: Project Type: null Uid: null Input: 0 Column: 2):987249254
                    |
                    |---(Name: Constant Type: null Uid: null):1850954068
        

        The operator (Name: GreaterThanEqual Type: null Uid: null):1020154737 appears twice in the plan,
        once under
        |---(Name: Or Type: null Uid: null):1605851606
        and once under
        |---(Name: Or Type: null Uid: null):1787189503

        The error message

        Attempt to remove operator GreaterThanEqual that is still connected in the plan

        is raised when FilterExtractor tries to remove GreaterThanEqual from the first OR:1605851606 while GreaterThanEqual is still connected to OR:1787189503.
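
The failure mode can be reproduced in miniature. The sketch below is a toy model, not Pig's actual classes: `SharedNodeDemo` and its methods are hypothetical names, and the plan is reduced to a map from each node to the parents that still point at it. It only illustrates why removing a node that two Or operators share fails after the first Or lets go of it.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Toy model of an expression plan (hypothetical, not Pig's BaseOperatorPlan). */
class SharedNodeDemo {
    // For each node, the set of parents that still have an edge to it.
    private final Map<String, Set<String>> parents = new HashMap<>();

    void connect(String parent, String child) {
        parents.computeIfAbsent(child, k -> new LinkedHashSet<>()).add(parent);
    }

    void disconnect(String parent, String child) {
        Set<String> p = parents.get(child);
        if (p != null) {
            p.remove(parent);
        }
    }

    /** Refuses to drop a node that still has an incoming edge, like the check behind ERROR 2243. */
    void remove(String node) {
        Set<String> p = parents.getOrDefault(node, Collections.emptySet());
        if (!p.isEmpty()) {
            throw new IllegalStateException(
                "Attempt to remove operator " + node + " that is still connected in the plan");
        }
        parents.remove(node);
    }

    /** Shares one GreaterThanEqual between two Or nodes, then removes it via the first Or only. */
    static String demo() {
        SharedNodeDemo plan = new SharedNodeDemo();
        String gte = "GreaterThanEqual:1020154737";
        plan.connect("Or:1605851606", gte);
        plan.connect("Or:1787189503", gte);
        plan.disconnect("Or:1605851606", gte);
        try {
            plan.remove(gte); // the second Or still points at the node
            return "removed";
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this toy model the removal is rejected because the edge from OR:1787189503 is still present, which matches the situation described above.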

        knoguchi Koji Noguchi added a comment -

        There are two ways of handling this:
        (1) Deep-copy the expressions everywhere so that the same expression is not used in multiple locations.
        Or
        (2) Delay the removal of an expression that is still used elsewhere.

        I'll try getting a patch with (2) and see how it looks.
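
One way option (2) could look, as a minimal sketch only: the committed patch is not shown in this thread, so `DeferredRemovalDemo` and `detachAndMaybeRemove` are hypothetical names on a toy plan, not Pig's FilterExtractor. The idea is to detach the expression from one parent at a time and drop it from the plan only when no other parent refers to it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy sketch of delayed removal (hypothetical, not the committed patch). */
class DeferredRemovalDemo {
    // For each node, the set of parents that still have an edge to it.
    private final Map<String, Set<String>> parents = new HashMap<>();
    private final Set<String> nodes = new LinkedHashSet<>();

    void connect(String parent, String child) {
        nodes.add(parent);
        nodes.add(child);
        parents.computeIfAbsent(child, k -> new LinkedHashSet<>()).add(parent);
    }

    boolean contains(String node) {
        return nodes.contains(node);
    }

    /**
     * Detaches child from one parent. The node is removed only when no other
     * parent uses it; otherwise removal is delayed. Returns true if removed.
     */
    boolean detachAndMaybeRemove(String parent, String child) {
        Set<String> p = parents.getOrDefault(child, new LinkedHashSet<>());
        p.remove(parent);
        if (!p.isEmpty()) {
            return false; // still shared elsewhere: delay the removal
        }
        parents.remove(child);
        nodes.remove(child);
        return true;
    }

    /** Returns whether the shared node was removed on the first and second detach. */
    static List<Boolean> demo() {
        DeferredRemovalDemo plan = new DeferredRemovalDemo();
        String gte = "GreaterThanEqual:1020154737";
        plan.connect("Or:1605851606", gte);
        plan.connect("Or:1787189503", gte);
        List<Boolean> removed = new ArrayList<>();
        removed.add(plan.detachAndMaybeRemove("Or:1605851606", gte));
        removed.add(plan.detachAndMaybeRemove("Or:1787189503", gte));
        return removed;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Compared with option (1), this avoids copying every expression up front, at the cost of tracking which parents still reference a node.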

        knoguchi Koji Noguchi added a comment -

        (2) Delay the removal for expression that is used elsewhere.

        Attaching a patch and running a full test now.

        knoguchi Koji Noguchi added a comment -

        Attaching a patch and running a full test now.

        Tests look good. Making it patch-available.

        rohini Rohini Palaniswamy added a comment -

        +1

        knoguchi Koji Noguchi added a comment -

        Thanks for the review, Rohini!

        Committed to branch 0.17 and trunk.


          People

          • Assignee:
            knoguchi Koji Noguchi
            Reporter:
            knoguchi Koji Noguchi