Pig
  1. Pig
  2. PIG-2963

Illustrate command and POPackageLite

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Critical Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.11
    • Component/s: None
    • Labels:
      None

      Description

      While trying to execute a simple script like:

      A = LOAD 'test01' AS (f1:chararray,f2:int,f3:chararray);
      B = order A by f1;
      illustrate B;

      or

      C = foreach B generate f1, f2;
      illustrate C;

      I got the following exception:

      java.lang.RuntimeException: ReadOnceBag does not support getMemorySize operation
      at org.apache.pig.data.ReadOnceBag.getMemorySize(ReadOnceBag.java:74)
      at org.apache.pig.data.SizeUtil.getPigObjMemSize(SizeUtil.java:61)
      at org.apache.pig.data.DefaultTuple.getMemorySize(DefaultTuple.java:180)
      at org.apache.pig.pen.util.ExampleTuple.getMemorySize(ExampleTuple.java:97)
      at org.apache.pig.data.DefaultAbstractBag.getMemorySize(DefaultAbstractBag.java:148)
      at org.apache.pig.data.DefaultAbstractBag.markSpillableIfNecessary(DefaultAbstractBag.java:100)
      at org.apache.pig.data.DefaultAbstractBag.add(DefaultAbstractBag.java:92)
      at org.apache.pig.pen.Illustrator.addData(Illustrator.java:116)
      at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPackageLite.illustratorMarkup(POPackageLite.java:227)
      at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPackageLite.getNext(POPackageLite.java:182)
      at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.processOnePackageOutput(PigGenericMapReduce.java:422)
      at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.reduce(PigGenericMapReduce.java:413)
      at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.reduce(PigGenericMapReduce.java:257)
      at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
      at org.apache.pig.pen.LocalMapReduceSimulator.launchPig(LocalMapReduceSimulator.java:235)
      at org.apache.pig.pen.ExampleGenerator.getData(ExampleGenerator.java:257)
      at org.apache.pig.pen.ExampleGenerator.getData(ExampleGenerator.java:238)
      at org.apache.pig.pen.LineageTrimmingVisitor.init(LineageTrimmingVisitor.java:103)
      at org.apache.pig.pen.LineageTrimmingVisitor.<init>(LineageTrimmingVisitor.java:98)
      at org.apache.pig.pen.ExampleGenerator.getExamples(ExampleGenerator.java:166)
      at org.apache.pig.PigServer.getExamples(PigServer.java:1180)
      at org.apache.pig.tools.grunt.GruntParser.processIllustrate(GruntParser.java:738)
      at org.apache.pig.tools.pigscript.parser.PigScriptParser.Illustrate(PigScriptParser.java:626)
      at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:323)
      at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:193)
      at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:169)
      at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
      at org.apache.pig.Main.run(Main.java:538)
      at org.apache.pig.Main.main(Main.java:154)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      at java.lang.reflect.Method.invoke(Method.java:597)
      at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
      ======================================================================

      At log file, the following:

      Pig Stack Trace
      ---------------
      ERROR 2997: Encountered IOException. Exception

      java.io.IOException: Exception
      at org.apache.pig.PigServer.getExamples(PigServer.java:1186)
      at org.apache.pig.tools.grunt.GruntParser.processIllustrate(GruntParser.java:738)
      at org.apache.pig.tools.pigscript.parser.PigScriptParser.Illustrate(PigScriptParser.java:626)
      at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:323)
      at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:193)
      at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:169)
      at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
      at org.apache.pig.Main.run(Main.java:538)
      at org.apache.pig.Main.main(Main.java:154)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      at java.lang.reflect.Method.invoke(Method.java:597)
      at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
      Caused by: java.lang.RuntimeException: ReadOnceBag does not support getMemorySize operation
      at org.apache.pig.data.ReadOnceBag.getMemorySize(ReadOnceBag.java:74)
      at org.apache.pig.data.SizeUtil.getPigObjMemSize(SizeUtil.java:61)
      at org.apache.pig.data.DefaultTuple.getMemorySize(DefaultTuple.java:180)
      at org.apache.pig.pen.util.ExampleTuple.getMemorySize(ExampleTuple.java:97)
      at org.apache.pig.data.DefaultAbstractBag.getMemorySize(DefaultAbstractBag.java:148)
      at org.apache.pig.data.DefaultAbstractBag.markSpillableIfNecessary(DefaultAbstractBag.java:100)
      at org.apache.pig.data.DefaultAbstractBag.add(DefaultAbstractBag.java:92)
      at org.apache.pig.pen.Illustrator.addData(Illustrator.java:116)
      at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPackageLite.illustratorMarkup(POPackageLite.java:227)
      at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPackageLite.getNext(POPackageLite.java:182)
      at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.processOnePackageOutput(PigGenericMapReduce.java:422)
      at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.reduce(PigGenericMapReduce.java:413)
      at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Reduce.reduce(PigGenericMapReduce.java:257)
      at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
      at org.apache.pig.pen.LocalMapReduceSimulator.launchPig(LocalMapReduceSimulator.java:235)
      at org.apache.pig.pen.ExampleGenerator.getData(ExampleGenerator.java:257)
      at org.apache.pig.pen.ExampleGenerator.getData(ExampleGenerator.java:238)
      at org.apache.pig.pen.LineageTrimmingVisitor.init(LineageTrimmingVisitor.java:103)
      at org.apache.pig.pen.LineageTrimmingVisitor.<init>(LineageTrimmingVisitor.java:98)
      at org.apache.pig.pen.ExampleGenerator.getExamples(ExampleGenerator.java:166)
      at org.apache.pig.PigServer.getExamples(PigServer.java:1180)

      1. PIG-2963.patch
        0.5 kB
        Cheolsoo Park

        Issue Links

          Activity

          Hide
          Cheolsoo Park added a comment -

          This is a regression from PIG-2923. Verified that the error goes away by reverting it.

          What's happening is that illustrate adds to DefautAbstractBag a tuple that has a ReadOnceBag field.

          The problem is that PIG-2923 made DefautAbstractBag check whether or not it should spill to disk every time a new element is added to it. To compute the memory size of the new element, DefautAbstractBag iterates through every field of the tuple, in this case which is ReadOnceBag. Unfortunately, ReadOnceBag doesn't support getMemorySize() and throws a runtime exception.

          I am attaching a simple fix that makes ReadOnceBag.getMemorySize() return 0 instead of throwing a runtime exception. I am returning 0 here because the comment in ReadOnceBag says "this bag does not store the tuples in memory".

          I am not familiar with ReadOnceBag, so please correct me if this is not a proper fix.

          Thanks!

          Show
          Cheolsoo Park added a comment - This is a regression from PIG-2923 . Verified that the error goes away by reverting it. What's happening is that illustrate adds to DefautAbstractBag a tuple that has a ReadOnceBag field. The problem is that PIG-2923 made DefautAbstractBag check whether or not it should spill to disk every time a new element is added to it. To compute the memory size of the new element, DefautAbstractBag iterates through every field of the tuple, in this case which is ReadOnceBag. Unfortunately, ReadOnceBag doesn't support getMemorySize() and throws a runtime exception. I am attaching a simple fix that makes ReadOnceBag.getMemorySize() return 0 instead of throwing a runtime exception. I am returning 0 here because the comment in ReadOnceBag says "this bag does not store the tuples in memory". I am not familiar with ReadOnceBag, so please correct me if this is not a proper fix. Thanks!
          Hide
          Allan Avendaño added a comment -

          Thank Cheolsoo! It's working now.
          Now, I'm wondering which is the process to include it into the trunk.

          Thanks again!

          Show
          Allan Avendaño added a comment - Thank Cheolsoo! It's working now. Now, I'm wondering which is the process to include it into the trunk. Thanks again!
          Hide
          Jonathan Coveney added a comment -

          Allan: it comes down to pestering a committer to reviewing and committing, or just hoping they look at it.

          Which I just did. Thanks for the fix, Cheolsoo. It's in!

          Show
          Jonathan Coveney added a comment - Allan: it comes down to pestering a committer to reviewing and committing, or just hoping they look at it. Which I just did. Thanks for the fix, Cheolsoo. It's in!
          Hide
          Allan Avendaño added a comment -

          Thanks Jonathan for the explanation and for committing it!

          Show
          Allan Avendaño added a comment - Thanks Jonathan for the explanation and for committing it!

            People

            • Assignee:
              Cheolsoo Park
              Reporter:
              Allan Avendaño
            • Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development