Pig / PIG-5372

SAMPLE/RANDOM(udf) before skewed join failing with NPE

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.16.0
    • Fix Version/s: 0.17.1
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      A short script like the following reproduces the problem:

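      -- Load the same input twice so that it can be joined with itself
      A = LOAD 'input.txt' AS (a1:int, a2:chararray, a3:int);
      B = LOAD 'input.txt' AS (b1:int, b2:chararray, b3:int);

      -- Add a column produced by the RANDOM() UDF ahead of the join
      A2 = FOREACH A GENERATE *, RANDOM() AS randnum;

      -- Skewed join on the first column, with two reducers
      D = JOIN A2 BY a1, B BY b1 USING 'skewed' PARALLEL 2;

      STORE D INTO '$output';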
      

      It fails with a NullPointerException (NPE) thrown from RANDOM.exec. The trace below is from a run on Tez:

      2018-12-12 16:06:04,860 [Dispatcher thread: Central] INFO  org.apache.tez.dag.history.HistoryEventHandler - [HISTORY][DAG:dag_1544648742542_0001_1][Event:TASK_FINISHED]: vertexName=scope-55, taskId=task_1544648742542_0001_1_02_000000, startTime=1544648745036, finishTime=1544648764857, timeTaken=19821, status=KILLED, successfulAttemptID=null, diagnostics=TaskAttempt 0 failed, info=[Error: Failure while running task:org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing (Name: Local Rearrange[tuple]{int}(false) - scope-29 ->       scope-58 Operator Key: scope-29): org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing [POUserFunc (Name: POUserFunc(org.apache.pig.builtin.RANDOM)[double] - scope-40 Operator Key: scope-40) children: null at []]: java.lang.NullPointerException
              at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:315)
              at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNextTuple(POLocalRearrange.java:287)
              at org.apache.pig.backend.hadoop.executionengine.tez.plan.operator.POLocalRearrangeTez.getNextTuple(POLocalRearrangeTez.java:131)
              at org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.runPipeline(PigProcessor.java:420)
              at org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.run(PigProcessor.java:282)
              at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:337)
              at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
              at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
              at java.security.AccessController.doPrivileged(Native Method)
              at javax.security.auth.Subject.doAs(Subject.java:422)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
              at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
              at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
              at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
              at java.util.concurrent.FutureTask.run(FutureTask.java:266)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
              at java.lang.Thread.run(Thread.java:745)
      Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing [POUserFunc (Name: POUserFunc(org.apache.pig.builtin.RANDOM)[double] - scope-40 Operator Key: scope-40) children: null at []]: java.lang.NullPointerException
              at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:367)
              at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:408)
              at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNextTuple(POForEach.java:325)
              at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:305)
              ... 17 more
      Caused by: java.lang.NullPointerException
              at org.apache.pig.builtin.RANDOM.exec(RANDOM.java:51)
              at org.apache.pig.builtin.RANDOM.exec(RANDOM.java:37)
              at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:332)
              at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNextDouble(POUserFunc.java:396)
              at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:343)
              ... 20 more
      ]
      

      Attachments

        1. pig-5372-v1.patch
          6 kB
          Koji Noguchi
        2. pig-5372-v2.patch
          7 kB
          Koji Noguchi

          People

            Assignee: Koji Noguchi (knoguchi)
            Reporter: Koji Noguchi (knoguchi)
            Votes: 0
            Watchers: 3
