Details
-
Bug
-
Status: Open
-
Major
-
Resolution: Unresolved
-
0.12.0, 0.13.0
-
None
-
None
-
None
Description
test1.txt
{(,A1111,A),(,B222,B),(,C333,C)}
test2.txt
A Helloworld B Pig C Hive
tt.pig
A = LOAD 'test1.txt' AS (mybag:bag{t:tuple(title:chararray,name:chararray, id:chararray)}); A1 = FOREACH A generate flatten(mybag) as (title:chararray,name:chararray, id:chararray); B = LOAD 'test2.txt' AS (id:chararray, content:chararray); C = JOIN A1 BY id LEFT OUTER, B BY id; dump C;
$ pig -x local -f tt.pig
....
java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
at java.util.ArrayList.rangeCheck(ArrayList.java:604)
at java.util.ArrayList.get(ArrayList.java:382)
at org.apache.pig.data.DefaultTuple.get(DefaultTuple.java:115)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPackage.getValueTuple(POPackage.java:350)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPackage.getNext(POPackage.java:273)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.processOnePackageOutput(PigGenericMapReduce.java:425)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.reduce(PigGenericMapReduce.java:416)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.reduce(PigGenericMapReduce.java:256)
at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:170)
at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:636)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:396)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:441)
[main] WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failu
re if you want Pig to stop immediately on failure.
==============
If we change the code into this one:
tt.pig
A = LOAD 'test1.txt' AS (mybag:bag{t:tuple(title:chararray,name:chararray, id:chararray)}); A1 = FOREACH A generate flatten(mybag) as (title:chararray,name:chararray, id:chararray); A1 = FOREACH A1 generate title,name,id; B = LOAD 'test2.txt' AS (id:chararray, content:chararray); C = JOIN A1 BY id LEFT OUTER, B BY id; dump C;
The job succeed, and here is the result of execution.
======== (,A1111,A,A,Helloworld) (,B222,B,B,Pig) (,C333,C,C,Hive) (,,,,) ========
Attachments
Issue Links
- depends upon
-
PIG-2315 Make as clause work in generate
- Closed