Uploaded image for project: 'Pig'
  1. Pig
  2. PIG-3446 Umbrella jira for Pig on Tez
  3. PIG-4112

NPE in packager when union + group-by followed by replicated join in Tez

    XMLWordPrintableJSON

    Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.14.0
    • Component/s: tez
    • Labels:
      None

      Description

      To reproduce the error, run the following query-

      A = load 'foo' as (id:int, fruit);
      B = load 'foo' as (id:int, fruit);
      C = union A, B;
      D = group C by id;
      E = load 'foo' as (id:int, fruit);
      F = join D by group, E by id using 'replicated';
      dump F;
      

      Here is the stack trace-

      Error: Failure while running task:java.lang.NullPointerException
      : at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.Packager.getValueTuple(Packager.java:215)
      : at org.apache.pig.backend.hadoop.executionengine.tez.POShuffleTezLoad.getNextTuple(POShuffleTezLoad.java:179)
      : at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:301)
      : at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFRJoin.getNextTuple(POFRJoin.java:270)
      : at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:301)
      : at org.apache.pig.backend.hadoop.executionengine.tez.POStoreTez.getNextTuple(POStoreTez.java:113)
      : at org.apache.pig.backend.hadoop.executionengine.tez.PigProcessor.runPipeline(PigProcessor.java:317)
      : at org.apache.pig.backend.hadoop.executionengine.tez.PigProcessor.run(PigProcessor.java:196)
      : at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
      : at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:180)
      : at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:172)
      : at java.security.AccessController.doPrivileged(Native Method)
      : at javax.security.auth.Subject.doAs(Subject.java:415)
      : at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
      : at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:172)
      : at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:167)
      : at java.util.concurrent.FutureTask.run(FutureTask.java:262)
      : at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      : at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      : at java.lang.Thread.run(Thread.java:744)
      

        Attachments

        1. PIG-4112-1.patch
          2 kB
          Cheolsoo Park

          Activity

            People

            • Assignee:
              rohini Rohini Palaniswamy
              Reporter:
              cheolsoo Cheolsoo Park
            • Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: