Details
-
Bug
-
Status: Open
-
Major
-
Resolution: Unresolved
-
0.9.0
-
None
-
None
-
None
-
Unlinking from 0.11 for the same reason
Description
With Pig0.9, if a relation and a column has the same name and if the column is used in a nested foreach, the script execution fails
while compiling.
The below is the trace;
java.lang.NullPointerException at org.apache.pig.newplan.logical.visitor.ScalarVisitor$1.visit(ScalarVisitor.java:63) at org.apache.pig.newplan.logical.expression.ScalarExpression.accept(ScalarExpression.java:109) at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75) at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50) at org.apache.pig.newplan.logical.optimizer.AllExpressionVisitor.visit(AllExpressionVisitor.java:142) at org.apache.pig.newplan.logical.relational.LOSort.accept(LOSort.java:119) at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75) at org.apache.pig.newplan.logical.optimizer.AllExpressionVisitor.visit(AllExpressionVisitor.java:104) at org.apache.pig.newplan.logical.relational.LOForEach.accept(LOForEach.java:74) at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75) at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50) at org.apache.pig.PigServer$Graph.compile(PigServer.java:1674) at org.apache.pig.PigServer$Graph.compile(PigServer.java:1666) at org.apache.pig.PigServer$Graph.access$200(PigServer.java:1391) at org.apache.pig.PigServer.execute(PigServer.java:1293) at org.apache.pig.PigServer.executeBatch(PigServer.java:359) at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:131) at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:192) at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:164) at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:81) at org.apache.pig.Main.run(Main.java:553) at org.apache.pig.Main.main(Main.java:108)
This could be reproduced with the below script
f3 = load 'input.txt' as (a1:chararray); A = load '3char_1long_tab' as (f1:chararray, f2:chararray, f3:chararray,ct:long); B = GROUP A BY f1; C = FOREACH B { zip_ordered = ORDER A BY f3 ASC; GENERATE FLATTEN(group) AS f1, A.(f3, ct), COUNT(zip_ordered), SUM(A.ct) AS total; }; STORE C INTO 'deletemeanytimeplease';
Checked with a unit test in trunk, the behavior is still same.