Hadoop Common / HADOOP-3068

Hadoop streaming tasks hang when stream.non.zero.exit.is.failure==true and reduce processes exit with non-zero status


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Environment: Java(TM) SE Runtime Environment (build 1.6.0_04-b12); Hadoop version 0.17.0-dev, r639662

    Description

      When I set stream.non.zero.exit.is.failure to true and run a streaming job whose reducers exit with a non-zero status, those tasks fail with the exception below and then appear to wait for something.

      ...
      2008-03-21 13:33:53,715 INFO org.apache.hadoop.streaming.PipeMapRed: R/W/S=65501/1/0 in:334=65501/196 [rec/s] out:0=1/196 [rec/s]
      2008-03-21 13:33:53,719 INFO org.apache.hadoop.streaming.PipeMapRed: mapRedFinished
      2008-03-21 13:34:11,228 INFO org.apache.hadoop.streaming.PipeMapRed: Records R/W=65536/2
      2008-03-21 13:34:11,235 INFO org.apache.hadoop.streaming.PipeMapRed: PipeMapRed.waitOutputThreads(): subprocess exitted with code 1
      2008-03-21 13:34:11,235 INFO org.apache.hadoop.streaming.PipeMapRed: MRErrorThread done
      2008-03-21 13:34:11,238 INFO org.apache.hadoop.streaming.PipeMapRed: MROutputThread done
      2008-03-21 13:34:11,245 WARN org.apache.hadoop.mapred.TaskTracker: Error running child
      java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
      at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:331)
      at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:475)
      at org.apache.hadoop.streaming.PipeReducer.close(PipeReducer.java:110)
      at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:399)
      at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2113)

      After that, the task still shows status: Running but makes no further progress. If all tasks reach this state, the whole cluster hangs.
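For readers unfamiliar with the setting, here is a minimal sketch of the exit-status check that the log messages above suggest. It is not Hadoop's PipeMapRed code: `wait_for_subprocess` and `FAILING_REDUCER` are hypothetical names, and the child process is only a stand-in for a reducer that exits with status 1.

```python
import subprocess
import sys


def wait_for_subprocess(cmd, non_zero_exit_is_failure):
    """Run cmd to completion and apply the exit-status policy.

    Hypothetical helper for illustration; not Hadoop's implementation.
    """
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    # Feed the child one record, then close stdin so it sees end of input.
    proc.communicate(input=b"key\tvalue\n")
    code = proc.returncode
    if code != 0 and non_zero_exit_is_failure:
        # Mirrors the wording of the RuntimeException in the log above.
        raise RuntimeError("subprocess failed with code %d" % code)
    return code


# Stand-in for a reducer that consumes its input and then exits with status 1.
FAILING_REDUCER = [
    sys.executable,
    "-c",
    "import sys; sys.stdin.read(); sys.exit(1)",
]
```

With the flag off, the call simply returns the exit code; with the flag on, it raises. The hang reported here occurs after that point: the exception is thrown, yet the task stays in the Running state.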

      BTW, may I suggest that we make stream.non.zero.exit.is.failure default to true after this is fixed?

            People

              Assignee: Unassigned
              Reporter: Yuri Pradkin (yurip)