Hadoop Common
  1. Hadoop Common
  2. HADOOP-4620

Streaming mapper never completes if the mapper does not write to stdout

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.17.2
    • Fix Version/s: 0.18.3
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Hide
      This patch HADOOP-4620.patch
      (1) solves the hanging problem on map side with empty input and nonempty output — this map task generates output properly to intermediate files similar to other map tasks.
      (2) solves the problem of hanging reducer with empty input to reduce task and nonempty output — this reduce task doesn't generate output if input to reduce task is empty.
      Show
      This patch HADOOP-4620 .patch (1) solves the hanging problem on map side with empty input and nonempty output — this map task generates output properly to intermediate files similar to other map tasks. (2) solves the problem of hanging reducer with empty input to reduce task and nonempty output — this reduce task doesn't generate output if input to reduce task is empty.

      Description

      A mapper of a streaming job has empty input data and thus it produces no output.
      The task never completes.

      The following are the last two lines from the task log:
      2008-11-07 21:59:48,254 INFO org.apache.hadoop.streaming.PipeMapRed: PipeMapRed exec [/usr/bin/perl, xxx]
      2008-11-07 21:59:48,330 INFO org.apache.hadoop.streaming.PipeMapRed: mapRedFinished

      1. solves_mapper_4620.patch
        5 kB
        Ravi Gummadi
      2. HADOOP-4620.patch
        10 kB
        Ravi Gummadi
      3. HADOOP17-4620.patch
        10 kB
        Ravi Gummadi

        Issue Links

          Activity

          Runping Qi created issue -
          Hide
          Runping Qi added a comment -

          I double checked the jobs behaving like that.
          The mapping child process (the perl script) did not exit (thus the task never completed because the user process never completed).
          I think the messages in the log are misleading.
          Maybe we just need to fix the logging part.

          Show
          Runping Qi added a comment - I double checked the jobs behaving like that. The mapping child process (the perl script) did not exit (thus the task never completed because the user process never completed). I think the messages in the log are misleading. Maybe we just need to fix the logging part.
          Ravi Gummadi made changes -
          Field Original Value New Value
          Assignee Ravi Gummadi [ ravidotg ]
          Hide
          Ravi Gummadi added a comment -

          So there is no issue of tasks hanging forever(even if the size of the map output is 0 bytes). I agree that the message "mapRedFinished" is displayed before output thread and error thread have completed their work. Log messages will be changed to be displayed appopriately.

          Otherwise, please provide a testcase that reproduces the problem of Map Task hanging. I am not able to reproduce the problem of hanging map task.

          Show
          Ravi Gummadi added a comment - So there is no issue of tasks hanging forever(even if the size of the map output is 0 bytes). I agree that the message "mapRedFinished" is displayed before output thread and error thread have completed their work. Log messages will be changed to be displayed appopriately. Otherwise, please provide a testcase that reproduces the problem of Map Task hanging. I am not able to reproduce the problem of hanging map task.
          Hide
          Koji Noguchi added a comment -

          I've seen streaming hang when

          1. Empty input
          2. Streaming process still tries to output.

          This is because MROutputThread is only created when first record is passed to the map().
          No input, thus no map(), thus no MROutputThread, and streaming-process stdout write hang forever.
          (in 0.17, we still had the timeout of 0(infinity))

          Show
          Koji Noguchi added a comment - I've seen streaming hang when Empty input Streaming process still tries to output. This is because MROutputThread is only created when first record is passed to the map(). No input, thus no map(), thus no MROutputThread, and streaming-process stdout write hang forever. (in 0.17, we still had the timeout of 0(infinity))
          Hide
          Ruyue Ma added a comment -

          Hi, Ravi

          You can re-produce the problem.

          copy a zero-length file to hdfs. Then run streaming job for this file. You know mapper will handle zero-length data, if the mapper still produce out, if the out is larger than 4KB, the mapper task will hang.

          The reason is: if mapper input data is zero, streaming would not start map output thread, but open the mapper's stdout pipe, the pipe's size defaultly is 4KB. If mapper doesn't generate data large than 4KB, mapper will end normally. But if larger than 4KB, it will hang.

          If you guys think it is bug, I can submit a patch to correct it.

          As far as I know, the latest revision resolve the problem for stderr output thread, but not for stdout output thread.

          Thanks

          Show
          Ruyue Ma added a comment - Hi, Ravi You can re-produce the problem. copy a zero-length file to hdfs. Then run streaming job for this file. You know mapper will handle zero-length data, if the mapper still produce out, if the out is larger than 4KB, the mapper task will hang. The reason is: if mapper input data is zero, streaming would not start map output thread, but open the mapper's stdout pipe, the pipe's size defaultly is 4KB. If mapper doesn't generate data large than 4KB, mapper will end normally. But if larger than 4KB, it will hang. If you guys think it is bug, I can submit a patch to correct it. As far as I know, the latest revision resolve the problem for stderr output thread, but not for stdout output thread. Thanks
          Hide
          Ravi Gummadi added a comment -

          Thanks. I could reproduce the problem now.

          Show
          Ravi Gummadi added a comment - Thanks. I could reproduce the problem now.
          Hide
          Ravi Gummadi added a comment -

          The ideal behaviour could be to have the correct output in this case also(input is of 0 size for mapper/reducer and the task generates output).

          Is it good enough that the mapper/reducer doesn't hang and finish execution in the case of 0 sized input(even though the output of the mapper/reducer is not available in intermediateFile/HDFS at the end of the task) ?

          Show
          Ravi Gummadi added a comment - The ideal behaviour could be to have the correct output in this case also(input is of 0 size for mapper/reducer and the task generates output). Is it good enough that the mapper/reducer doesn't hang and finish execution in the case of 0 sized input(even though the output of the mapper/reducer is not available in intermediateFile/HDFS at the end of the task) ?
          Hide
          Ruyue Ma added a comment -

          I agree with Ravi.

          if mapper will generate data, we will allow it, or report error. Hang is not accepted!

          Thanks

          Show
          Ruyue Ma added a comment - I agree with Ravi. if mapper will generate data, we will allow it, or report error. Hang is not accepted! Thanks
          Hide
          Ravi Gummadi added a comment -

          Adding the following new class in streaming and starting the output thread in its run() method solves the problem of mapper hanging(when 0 byte input & nonzero output).

          +public class PipeMapRunner<K1, V1, K2, V2> extends MapRunner<K1, V1, K2, V2> {
          + public void run(RecordReader<K1, V1> input, OutputCollector<K2, V2> output,
          + Reporter reporter)
          + throws IOException

          { + PipeMapper pipeMapper = (PipeMapper)getMapper(); + pipeMapper.startOutputThreads(output, reporter); + super.run(input, output, reporter); + }

          +}

          Investigating if similar thing can be done in Reduce phase also(Reducer also hangs if input is of size 0 bytes and if it produces output).
          Thoughts ?

          Show
          Ravi Gummadi added a comment - Adding the following new class in streaming and starting the output thread in its run() method solves the problem of mapper hanging(when 0 byte input & nonzero output). +public class PipeMapRunner<K1, V1, K2, V2> extends MapRunner<K1, V1, K2, V2> { + public void run(RecordReader<K1, V1> input, OutputCollector<K2, V2> output, + Reporter reporter) + throws IOException { + PipeMapper pipeMapper = (PipeMapper)getMapper(); + pipeMapper.startOutputThreads(output, reporter); + super.run(input, output, reporter); + } +} Investigating if similar thing can be done in Reduce phase also(Reducer also hangs if input is of size 0 bytes and if it produces output). Thoughts ?
          Hide
          Runping Qi added a comment -

          Why not simply start the MROutThread in the config function in the same way as for MRErrorThread?

          Show
          Runping Qi added a comment - Why not simply start the MROutThread in the config function in the same way as for MRErrorThread?
          Hide
          Ravi Gummadi added a comment -

          Hi Runping, This is because OutputCollector and Reporter are parameters to MROutputThread(), but they are not available in configure().

          Show
          Ravi Gummadi added a comment - Hi Runping, This is because OutputCollector and Reporter are parameters to MROutputThread(), but they are not available in configure().
          Hide
          Runping Qi added a comment -

          I see.

          If we take this approach, the StreamJob needs to set the map runner class too when it sets the mapper class.

          Show
          Runping Qi added a comment - I see. If we take this approach, the StreamJob needs to set the map runner class too when it sets the mapper class.
          Hide
          Ravi Gummadi added a comment -

          Yes Runping. I have a patch ready with those changes(including setting map runner in StreamJob). Its working fine for mappers — not hanging and giving proper output. Investigating the similar issue on the reducer side.

          Show
          Ravi Gummadi added a comment - Yes Runping. I have a patch ready with those changes(including setting map runner in StreamJob). Its working fine for mappers — not hanging and giving proper output. Investigating the similar issue on the reducer side.
          Hide
          Ruyue Ma added a comment - - edited

          if mapper has no data to handle. We could not start native process. This will improve performance and avoid this problem.

          so i suggest that it should start the native process in PiperMap->map().

          we can move PipeMapRed part code to map function

          // Start the process
          ProcessBuilder builder = new ProcessBuilder(argvSplit);
          builder.environment().putAll(childEnv.toMap());
          sim = builder.start();

          clientOut_ = new DataOutputStream(new BufferedOutputStream(
          sim.getOutputStream(),
          BUFFER_SIZE));
          clientIn_ = new DataInputStream(new BufferedInputStream(
          sim.getInputStream(),
          BUFFER_SIZE));
          clientErr_ = new DataInputStream(new BufferedInputStream(sim.getErrorStream()));
          startTime_ = System.currentTimeMillis();

          errThread_ = new MRErrorThread();
          errThread_.start();

          Show
          Ruyue Ma added a comment - - edited if mapper has no data to handle. We could not start native process. This will improve performance and avoid this problem. so i suggest that it should start the native process in PiperMap->map(). we can move PipeMapRed part code to map function // Start the process ProcessBuilder builder = new ProcessBuilder(argvSplit); builder.environment().putAll(childEnv.toMap()); sim = builder.start(); clientOut_ = new DataOutputStream(new BufferedOutputStream( sim.getOutputStream(), BUFFER_SIZE)); clientIn_ = new DataInputStream(new BufferedInputStream( sim.getInputStream(), BUFFER_SIZE)); clientErr_ = new DataInputStream(new BufferedInputStream(sim.getErrorStream())); startTime_ = System.currentTimeMillis(); errThread_ = new MRErrorThread(); errThread_.start();
          Hide
          Devaraj Das added a comment -

          if mapper has no data to handle. We could not start native process. This will improve performance and avoid this problem.

          -1. I think we should start the native process anyways.

          Show
          Devaraj Das added a comment - if mapper has no data to handle. We could not start native process. This will improve performance and avoid this problem. -1. I think we should start the native process anyways.
          Hide
          Ravi Gummadi added a comment -

          The attached patch solves the hanging issue on map side only . It doesn't solve the problem of reducer hanging when 0 sized input is given and reducer generates nonzero sized output.

          Show
          Ravi Gummadi added a comment - The attached patch solves the hanging issue on map side only . It doesn't solve the problem of reducer hanging when 0 sized input is given and reducer generates nonzero sized output.
          Ravi Gummadi made changes -
          Attachment solves_mapper_4620.patch [ 12395370 ]
          Hide
          Sharad Agarwal added a comment -

          Had a discussion on this Devaraj. It seems there aren't reasonable usecase which may require outputting from reducer with no input. It should be ok to not output anything from reducer with no input. However hang problem should be fixed for it as well.

          Show
          Sharad Agarwal added a comment - Had a discussion on this Devaraj. It seems there aren't reasonable usecase which may require outputting from reducer with no input. It should be ok to not output anything from reducer with no input. However hang problem should be fixed for it as well.
          Hide
          Devaraj Das added a comment -

          I'd propose that for Reducers, we solve only the "hang" problem for the empty input case. I don't think it is required to support the collector for reduces with empty inputs. A reducer is supposed to aggregate/reduce data that the maps generate for it and if the maps didn't generate anything for a given reducer it seems okay that it should not generate any output.
          For the maps, the picture is a bit different with empty inputs and we should support it. The use case here is Hadoop enables parallelizing the native application (where the application could be reading its input off some source which Hadoop is not aware of), or, it just enables running multiple instances of the native application on a cluster. This use case might set the number of reduces to 0.
          What do others think?

          Show
          Devaraj Das added a comment - I'd propose that for Reducers, we solve only the "hang" problem for the empty input case. I don't think it is required to support the collector for reduces with empty inputs. A reducer is supposed to aggregate/reduce data that the maps generate for it and if the maps didn't generate anything for a given reducer it seems okay that it should not generate any output. For the maps, the picture is a bit different with empty inputs and we should support it. The use case here is Hadoop enables parallelizing the native application (where the application could be reading its input off some source which Hadoop is not aware of), or, it just enables running multiple instances of the native application on a cluster. This use case might set the number of reduces to 0. What do others think?
          Hide
          Runping Qi added a comment -

          +1

          Show
          Runping Qi added a comment - +1
          Hide
          Ravi Gummadi added a comment -

          Attached the patch HADOOP-4620.patch that
          (1) solves the hanging problem on map side with empty input and nonempty output — generates output properly to intermediate files similar to other map tasks.
          (2) solves the problem of hanging reducer with empty input to reduce task and nonempty output — doesn't generate output if input to reduce task is empty.

          Please review the patch and provide your comments. Thanks.

          Show
          Ravi Gummadi added a comment - Attached the patch HADOOP-4620 .patch that (1) solves the hanging problem on map side with empty input and nonempty output — generates output properly to intermediate files similar to other map tasks. (2) solves the problem of hanging reducer with empty input to reduce task and nonempty output — doesn't generate output if input to reduce task is empty. Please review the patch and provide your comments. Thanks.
          Ravi Gummadi made changes -
          Attachment HADOOP-4620.patch [ 12395409 ]
          Hide
          Ravi Gummadi added a comment -

          Submitting the patch....

          Show
          Ravi Gummadi added a comment - Submitting the patch....
          Ravi Gummadi made changes -
          Release Note This patch HADOOP-4620.patch
          (1) solves the hanging problem on map side with empty input and nonempty output — this map task generates output properly to intermediate files similar to other map tasks.
          (2) solves the problem of hanging reducer with empty input to reduce task and nonempty output — this reduce task doesn't generate output if input to reduce task is empty.
          Status Open [ 1 ] Patch Available [ 10002 ]
          Hide
          Ravi Gummadi added a comment -

          The patch for trunk(HADOOP-4620.patch) applies to version 18 also without error.
          Attaching now the patch for version-0.17 HADOOP17-4620.patch.

          Show
          Ravi Gummadi added a comment - The patch for trunk( HADOOP-4620 .patch) applies to version 18 also without error. Attaching now the patch for version-0.17 HADOOP17-4620.patch.
          Ravi Gummadi made changes -
          Attachment HADOOP17-4620.patch [ 12395526 ]
          Hide
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12395526/HADOOP17-4620.patch
          against trunk revision 724578.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          -1 patch. The patch command could not apply the patch.

          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3695/console

          This message is automatically generated.

          Show
          Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12395526/HADOOP17-4620.patch against trunk revision 724578. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3695/console This message is automatically generated.
          Hide
          Ravi Gummadi added a comment -

          HADOOP17-4620.patch is for branch 17. Patch HADOOP-4620.patch applies fine to trunk and version 0.18.2
          Here are the results of testing HADOOP-4620.patch :

          +1 overall.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

          Unit tests also passed on my machine.

          Show
          Ravi Gummadi added a comment - HADOOP17-4620.patch is for branch 17. Patch HADOOP-4620 .patch applies fine to trunk and version 0.18.2 Here are the results of testing HADOOP-4620 .patch : +1 overall. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 Eclipse classpath. The patch retains Eclipse classpath integrity. Unit tests also passed on my machine.
          Hide
          Sharad Agarwal added a comment -

          +1

          Show
          Sharad Agarwal added a comment - +1
          Hide
          Owen O'Malley added a comment -

          +1 for putting this into 0.18.3

          Show
          Owen O'Malley added a comment - +1 for putting this into 0.18.3
          Hide
          Devaraj Das added a comment -

          I just committed this to 0.18, 0.19 branches and trunk. Thanks, Ravi!

          Show
          Devaraj Das added a comment - I just committed this to 0.18, 0.19 branches and trunk. Thanks, Ravi!
          Devaraj Das made changes -
          Hadoop Flags [Reviewed]
          Resolution Fixed [ 1 ]
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Devaraj Das made changes -
          Description
          A mapper of a streaming job has empty input data and thus it produces no output.
          The task never completes.

          The following are the last two lines from the task log:
          2008-11-07 21:59:48,254 INFO org.apache.hadoop.streaming.PipeMapRed: PipeMapRed exec [/usr/bin/perl, xxx]
          2008-11-07 21:59:48,330 INFO org.apache.hadoop.streaming.PipeMapRed: mapRedFinished
           
          A mapper of a streaming job has empty input data and thus it produces no output.
          The task never completes.

          The following are the last two lines from the task log:
          2008-11-07 21:59:48,254 INFO org.apache.hadoop.streaming.PipeMapRed: PipeMapRed exec [/usr/bin/perl, xxx]
          2008-11-07 21:59:48,330 INFO org.apache.hadoop.streaming.PipeMapRed: mapRedFinished
           
          Fix Version/s 0.18.3 [ 12313494 ]
          Nigel Daley made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Owen O'Malley made changes -
          Component/s mapred [ 12310690 ]
          Ravi Gummadi made changes -
          Link This issue relates to MAPREDUCE-1813 [ MAPREDUCE-1813 ]
          Transition Time In Source Status Execution Times Last Executer Last Execution Date
          Open Open Patch Available Patch Available
          27d 13h 5m 1 Ravi Gummadi 05/Dec/08 13:58
          Patch Available Patch Available Resolved Resolved
          6d 15h 34m 1 Devaraj Das 12/Dec/08 05:32
          Resolved Resolved Closed Closed
          49d 14h 41m 1 Nigel Daley 30/Jan/09 20:14

            People

            • Assignee:
              Ravi Gummadi
              Reporter:
              Runping Qi
            • Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development