Uploaded image for project: 'Pig'
  1. Pig
  2. PIG-146

More meaningful error message could be used when a non-existing column is accessed

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Minor
    • Resolution: Fixed
    • None
    • 0.2.0
    • None
    • None

    Description

      When accessing a non-existing column after getting the columns with a streaming command, I got the following error which is not quite meaningfule:

      [main] ERROR org.apache.pig.tools.grunt.Grunt -
      

      Here is the sample Pig script that I used. The data file has only 3 tab seperate fields so the streaming command on line 3 generates tuples with 3 columns. On line 4 the script tries to access the 4th column which does not exist and thus the error above occurs.

      A = load 'data;
      B = foreach A generate $2, $1, $0;
      C = stream B through `awk 'BEGIN {FS = "\t"; OFS = "\t"} {print $3, $2, $1}'`;
      D = foreach C generate $4;
      store D into 'results';
      

      Here is what happens on my machine:

      grunt> A = load 'data';
      grunt> B = foreach A generate $2, $1, $0;
      grunt> stream B through `awk 'BEGIN {FS = "\t"; OFS = "\t"} {print $3, $2, $1}'`;
      grunt> D = foreach C generate $4;
      2008-03-11 18:53:00,376 [main] ERROR org.apache.pig.tools.grunt.GruntParser - 
      grunt>
      

      A related note is that Pig behaves differently if there is no streaming command in the Pig script. In this case, an IndexOutOfBoundsException exception is generated at runtime.

      grunt> A = load 'data';
      grunt> B = foreach A generate $2, $1, $0;
      grunt>  D = foreach B generate $4;                    
      grunt> store D into 'results';
      2008-03-11 18:57:40,107 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce - ----- MapReduce Job -----
      2008-03-11 18:57:40,108 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce - Input: [data:org.apache.pig.builtin.PigStorage()]
      2008-03-11 18:57:40,109 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce - Map: [[*]->GENERATE {[PROJECT $2],[PROJECT $1],[PROJECT $0]}->GENERATE {[PROJECT $4]}]
      2008-03-11 18:57:40,109 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce - Group: null
      2008-03-11 18:57:40,110 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce - Combine: null
      2008-03-11 18:57:40,110 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce - Reduce: null
      2008-03-11 18:57:40,111 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce - Output: results:org.apache.pig.builtin.PigStorage
      2008-03-11 18:57:40,111 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce - Split: null
      2008-03-11 18:57:40,112 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce - Map parallelism: -1
      2008-03-11 18:57:40,112 [main] INFO  org.apache.pig.backend.hadoop.executionengine.POMapreduce - Reduce parallelism: -1
      2008-03-11 18:57:42,391 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher - Pig progress = 0%
      2008-03-11 18:57:59,466 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher - Error message from task (map) tip_200802211201_1494_m_000000 java.lang.IndexOutOfBoundsException: Requested index 4 from tuple (2.93, 21, rachel ovid)
              at org.apache.pig.data.Tuple.getField(Tuple.java:153)
              at org.apache.pig.impl.eval.ProjectSpec.eval(ProjectSpec.java:84)
              at org.apache.pig.impl.eval.SimpleEvalSpec$1.add(SimpleEvalSpec.java:35)
              at org.apache.pig.impl.eval.GenerateSpec$CrossProductItem.exec(GenerateSpec.java:261)
              at org.apache.pig.impl.eval.GenerateSpec$1.add(GenerateSpec.java:86)
              at org.apache.pig.impl.eval.GenerateSpec$CrossProductItem.add(GenerateSpec.java:230)
              at org.apache.pig.impl.eval.collector.UnflattenCollector.add(UnflattenCollector.java:52)
              at org.apache.pig.impl.eval.collector.DataCollector.addToSuccessor(DataCollector.java:93)
              at org.apache.pig.impl.eval.SimpleEvalSpec$1.add(SimpleEvalSpec.java:35)
              at org.apache.pig.impl.eval.GenerateSpec$CrossProductItem.exec(GenerateSpec.java:261)
              at org.apache.pig.impl.eval.GenerateSpec$1.add(GenerateSpec.java:86)
              at org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigMapReduce.run(PigMapReduce.java:113)
              at org.apache.hadoop.mapred.MapTask.run(MapTask.java:192)
              at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1760)
       java.lang.IndexOutOfBoundsException: Requested index 4 from tuple (2.93, 21, rachel ovid)
              at org.apache.pig.data.Tuple.getField(Tuple.java:153)
              at org.apache.pig.impl.eval.ProjectSpec.eval(ProjectSpec.java:84)
              at org.apache.pig.impl.eval.SimpleEvalSpec$1.add(SimpleEvalSpec.java:35)
              at org.apache.pig.impl.eval.GenerateSpec$CrossProductItem.exec(GenerateSpec.java:261)
              at org.apache.pig.impl.eval.GenerateSpec$1.add(GenerateSpec.java:86)
              at org.apache.pig.impl.eval.GenerateSpec$CrossProductItem.add(GenerateSpec.java:230)
              at org.apache.pig.impl.eval.collector.UnflattenCollector.add(UnflattenCollector.java:52)
              at org.apache.pig.impl.eval.collector.DataCollector.addToSuccessor(DataCollector.java:93)
              at org.apache.pig.impl.eval.SimpleEvalSpec$1.add(SimpleEvalSpec.java:35)
              at org.apache.pig.impl.eval.GenerateSpec$CrossProductItem.exec(GenerateSpec.java:261)
              at org.apache.pig.impl.eval.GenerateSpec$1.add(GenerateSpec.java:86)
              at org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigMapReduce.run(PigMapReduce.java:113)
              at org.apache.hadoop.mapred.MapTask.run(MapTask.java:192)
              at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1760)
       java.lang.IndexOutOfBoundsException: Requested index 4 from tuple (2.93, 21, rachel ovid)
              at org.apache.pig.data.Tuple.getField(Tuple.java:153)
              at org.apache.pig.impl.eval.ProjectSpec.eval(ProjectSpec.java:84)
              at org.apache.pig.impl.eval.SimpleEvalSpec$1.add(SimpleEvalSpec.java:35)
              at org.apache.pig.impl.eval.GenerateSpec$CrossProductItem.exec(GenerateSpec.java:261)
              at org.apache.pig.impl.eval.GenerateSpec$1.add(GenerateSpec.java:86)
              at org.apache.pig.impl.eval.GenerateSpec$CrossProductItem.add(GenerateSpec.java:230)
              at org.apache.pig.impl.eval.collector.UnflattenCollector.add(UnflattenCollector.java:52)
              at org.apache.pig.impl.eval.collector.DataCollector.addToSuccessor(DataCollector.java:93)
              at org.apache.pig.impl.eval.SimpleEvalSpec$1.add(SimpleEvalSpec.java:35)
              at org.apache.pig.impl.eval.GenerateSpec$CrossProductItem.exec(GenerateSpec.java:261)
              at org.apache.pig.impl.eval.GenerateSpec$1.add(GenerateSpec.java:86)
              at org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigMapReduce.run(PigMapReduce.java:113)
              at org.apache.hadoop.mapred.MapTask.run(MapTask.java:192)
              at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1760)
       java.lang.IndexOutOfBoundsException: Requested index 4 from tuple (2.93, 21, rachel ovid)
              at org.apache.pig.data.Tuple.getField(Tuple.java:153)
              at org.apache.pig.impl.eval.ProjectSpec.eval(ProjectSpec.java:84)
              at org.apache.pig.impl.eval.SimpleEvalSpec$1.add(SimpleEvalSpec.java:35)
              at org.apache.pig.impl.eval.GenerateSpec$CrossProductItem.exec(GenerateSpec.java:261)
              at org.apache.pig.impl.eval.GenerateSpec$1.add(GenerateSpec.java:86)
              at org.apache.pig.impl.eval.GenerateSpec$CrossProductItem.add(GenerateSpec.java:230)
              at org.apache.pig.impl.eval.collector.UnflattenCollector.add(UnflattenCollector.java:52)
              at org.apache.pig.impl.eval.collector.DataCollector.addToSuccessor(DataCollector.java:93)
              at org.apache.pig.impl.eval.SimpleEvalSpec$1.add(SimpleEvalSpec.java:35)
              at org.apache.pig.impl.eval.GenerateSpec$CrossProductItem.exec(GenerateSpec.java:261)
              at org.apache.pig.impl.eval.GenerateSpec$1.add(GenerateSpec.java:86)
              at org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigMapReduce.run(PigMapReduce.java:113)
              at org.apache.hadoop.mapred.MapTask.run(MapTask.java:192)
              at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1760)
      
      2008-03-11 18:57:59,469 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapreduceExec.MapReduceLauncher - Error message from task (reduce) tip_200802211201_1494_r_000000
      2008-03-11 18:57:59,470 [main] ERROR org.apache.pig.tools.grunt.GruntParser - Unable to store alias D
      

      Attachments

        Activity

          People

            Unassigned Unassigned
            xuzh Xu Zhang
            Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: