Description
The following script doesn't work:
A = load 'file:/etc/passwd' using PigStorage(':');
split A into A1 if $2<25, A2 if $2>25;
dump A1;
The error dump is:
2008-08-08 17:03:06,888 [main] WARN org.apache.pig.PigServer - bytearray is implicitly casted to integer under LOGreaterThan Operator
2008-08-08 17:03:06,889 [main] WARN org.apache.pig.PigServer - bytearray is implicitly casted to integer under LOLesserThan Operator
2008-08-08 17:03:06,970 [main] ERROR org.apache.pig.impl.plan.OperatorPlan - Attempt to give operator of type org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POStore multiple inputs. This operator does not support multiple inputs.
2008-08-08 17:03:06,973 [main] ERROR org.apache.pig.tools.grunt.GruntParser - java.io.IOException: Unable to open iterator for alias: A1 [Attempt to give operator of type org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POStore multiple inputs. This operator does not support multiple inputs.]
at org.apache.pig.backend.local.executionengine.LocalExecutionEngine.execute(LocalExecutionEngine.java:158)
at org.apache.pig.PigServer.execute(PigServer.java:519)
at org.apache.pig.PigServer.openIterator(PigServer.java:307)
at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:258)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:175)
at org.apache.pig.tools.grunt.GruntParser.parseContOnError(GruntParser.java:92)
at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:58)
at org.apache.pig.Main.main(Main.java:278)
Caused by: org.apache.pig.backend.executionengine.ExecException: Attempt to give operator of type org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POStore multiple inputs. This operator does not support multiple inputs.
... 8 more
Caused by: org.apache.pig.impl.plan.PlanException: Attempt to give operator of type org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POStore multiple inputs. This operator does not support multiple inputs.
at org.apache.pig.impl.plan.OperatorPlan.connect(OperatorPlan.java:169)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.plans.PhysicalPlan.connect(PhysicalPlan.java:81)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.plans.PhysicalPlan.connect(PhysicalPlan.java:1)
at org.apache.pig.impl.plan.OperatorPlan.addAsLeaf(OperatorPlan.java:395)
at org.apache.pig.backend.local.executionengine.LocalExecutionEngine.execute(LocalExecutionEngine.java:142)
... 7 more
2008-08-08 17:03:06,973 [main] ERROR org.apache.pig.tools.grunt.GruntParser - Unable to open iterator for alias: A1 [Attempt to give operator of type org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POStore multiple inputs. This operator does not support multiple inputs.]
2008-08-08 17:03:06,973 [main] ERROR org.apache.pig.tools.grunt.GruntParser - java.io.IOException: Unable to open iterator for alias: A1 [Attempt to give operator of type org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POStore multiple inputs. This operator does not support multiple inputs.]
The issue is that PigServer.OpenIterator doesn't send the logical plan corresponding to A1 but sends the plan with both splitOutputs. I have a fix for this and will attach it. We have two parallel branches which do nearly the same thing. OpenIterator & Store. However they have individual code paths and we see the same script with a store not raise a plan exception. For now I will just copy some code. But we need to fix this in a better way. Otherwise we will leave some bugs that we fix for store in dump.