Hadoop Map/Reduce / MAPREDUCE-969

NullPointerException during reduce freezes job

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 0.20.2
    • Fix Version/s: None
    • Component/s: jobtracker, task, tasktracker
    • Labels: None

    Description

      We experienced several jobs stuck in the reduce phase on a cluster. All of the stuck reduce tasks were waiting at "Need another 2 map output(s) where 0 is already in progress", even though all of the mappers had completed and 0 were scheduled. The stuck reducers had hit the following exception early in the shuffle:

      java.lang.NullPointerException
      at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:768)
      at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.getMapCompletionEvents(ReduceTask.java:2747)
      at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.run(ReduceTask.java:2670)

      Will attach more information and logs momentarily.
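
      The trace above ends inside ConcurrentHashMap.get, and ConcurrentHashMap rejects null keys with a NullPointerException. The following is a minimal sketch, assuming the lookup key was null (the report does not confirm this), of how an uncaught exception from such a lookup ends a background thread. The names NullKeyLookupSketch, pendingByHost and runFetcher are hypothetical and are not taken from ReduceTask.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicReference;

public class NullKeyLookupSketch {
    // Hypothetical stand-in for a bookkeeping map keyed by host; not Hadoop's actual field.
    static final ConcurrentHashMap<String, Integer> pendingByHost =
            new ConcurrentHashMap<String, Integer>();

    /**
     * Runs one lookup on a background thread and returns the uncaught
     * exception that terminated the thread, or null if it finished normally.
     */
    static Throwable runFetcher(final String hostKey) {
        final AtomicReference<Throwable> failure = new AtomicReference<Throwable>();
        Thread fetcher = new Thread(new Runnable() {
            @Override
            public void run() {
                // ConcurrentHashMap.get(null) throws NullPointerException.
                pendingByHost.get(hostKey);
            }
        }, "event-fetcher-sketch");
        // Record the exception instead of letting it vanish with the thread.
        fetcher.setUncaughtExceptionHandler(new Thread.UncaughtExceptionHandler() {
            @Override
            public void uncaughtException(Thread t, Throwable e) {
                failure.set(e);
            }
        });
        fetcher.start();
        try {
            fetcher.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return failure.get();
    }

    public static void main(String[] args) {
        pendingByHost.put("host-a", 2);
        // A non-null key completes normally, so no failure is recorded.
        System.out.println(runFetcher("host-a") == null);
        // A null key terminates the thread with a NullPointerException.
        System.out.println(runFetcher(null) instanceof NullPointerException);
    }
}
```

      If the thread that polls for map completion events exits this way, nothing else would deliver new events to the reducer, which would be consistent with the "Need another 2 map output(s)" message never clearing. That reading is an inference from the trace, not something the attached logs are quoted as confirming here.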

      Attachments

        1. bad_job_events
          124 kB
          Todd Lipcon
        2. bad_job_jt_logs
          399 kB
          Todd Lipcon
        3. reduce_task_logs
          66 kB
          Todd Lipcon

          People

            Assignee: Todd Lipcon
            Reporter: Todd Lipcon