Uploaded image for project: 'Hadoop Map/Reduce'
  1. Hadoop Map/Reduce
  2. MAPREDUCE-6361

NPE issue in shuffle caused by concurrent issue between copySucceeded() in one thread and copyFailed() in another thread on the same host

    Details

    • Target Version/s:
    • Hadoop Flags:
      Reviewed

      Description

      The failure in log:
      2015-05-08 21:00:00,513 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#25
      at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)
      at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:376)
      at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
      at java.security.AccessController.doPrivileged(Native Method)
      at javax.security.auth.Subject.doAs(Subject.java:415)
      at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
      at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
      Caused by: java.lang.NullPointerException
      at org.apache.hadoop.mapreduce.task.reduce.ShuffleSchedulerImpl.copyFailed(ShuffleSchedulerImpl.java:267)
      at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:308)
      at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:193)

        Attachments

          Activity

            People

            • Assignee:
              djp Junping Du
              Reporter:
              djp Junping Du
            • Votes:
              0 Vote for this issue
              Watchers:
              9 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: