Key: HADOOP-4620
Type: Bug
Status: Closed
Resolution: Fixed
Priority: Major
Assignee: Ravi Gummadi
Reporter: Runping Qi
Votes: 0
Watchers: 2

Hadoop Common

Streaming mapper never completes if the mapper does not write to stdout

Created: 08/Nov/08 12:52 AM   Updated: 08/Jul/09 04:53 PM
Component/s: None
Affects Version/s: 0.17.2
Fix Version/s: 0.18.3

Time Tracking:
Not Specified

File Attachments (text files, licensed for inclusion in ASF works):
  File                       Date                 Author        Size
  HADOOP-4620.patch          2008-12-05 01:42 PM  Ravi Gummadi  10 kB
  HADOOP17-4620.patch        2008-12-08 05:28 AM  Ravi Gummadi  10 kB
  solves_mapper_4620.patch   2008-12-05 06:39 AM  Ravi Gummadi  5 kB

Hadoop Flags: Reviewed
Release Note:
The patch HADOOP-4620.patch:
(1) fixes the hang on the map side when a map task has empty input and nonempty output. Such a map task now writes its output to the intermediate files in the same way as other map tasks.
(2) fixes the hang on the reduce side when a reduce task has empty input and nonempty output. Such a reduce task does not generate output if its input is empty.
Resolution Date: 12/Dec/08 05:32 AM


Description
A mapper in a streaming job receives empty input data and therefore produces no output.
The task never completes.

The following are the last two lines from the task log:
2008-11-07 21:59:48,254 INFO org.apache.hadoop.streaming.PipeMapRed: PipeMapRed exec [/usr/bin/perl, xxx]
2008-11-07 21:59:48,330 INFO org.apache.hadoop.streaming.PipeMapRed: mapRedFinished
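
The two input/output combinations discussed in this issue can be illustrated with a minimal sketch. It is not the reporter's Perl script and it does not reproduce the hang, which occurred inside Hadoop Streaming rather than in the mapper. Both function names and the "records" key are hypothetical, chosen only for illustration.

```python
import io


def silent_mapper(stdin, stdout):
    """Hypothetical mapper: emits one "key<TAB>1" record per non-blank input line.

    With empty input the loop body never runs, so nothing is written to stdout.
    """
    emitted = 0
    for line in stdin:
        key = line.strip()
        if key:
            stdout.write(f"{key}\t1\n")
            emitted += 1
    return emitted


def summary_mapper(stdin, stdout):
    """Hypothetical mapper: always emits a final record count, even for empty input."""
    count = sum(1 for line in stdin if line.strip())
    stdout.write(f"records\t{count}\n")
    return count


# Empty input, no output: the case in the description.
out = io.StringIO()
silent_mapper(io.StringIO(""), out)
print(repr(out.getvalue()))

# Empty input, nonempty output: the case named in the release note.
out = io.StringIO()
summary_mapper(io.StringIO(""), out)
print(repr(out.getvalue()))
```

To use either function as a real streaming mapper, a script would call it with sys.stdin and sys.stdout instead of the in-memory buffers used here.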



Subversion Commits
Repository: ASF
Revision: #725905
Date: Fri Dec 12 05:17:23 UTC 2008
User: ddas
Message: HADOOP-4620. Fixes Streaming to handle well the cases of map/reduce with empty input/output. Contributed by Ravi Gummadi.
Files Changed
MODIFY /hadoop/core/trunk/CHANGES.txt
MODIFY /hadoop/core/trunk/src/contrib/streaming/src/java/org/apache/hadoop/streaming/PipeMapRed.java
MODIFY /hadoop/core/trunk/src/contrib/streaming/src/java/org/apache/hadoop/streaming/StreamJob.java
MODIFY /hadoop/core/trunk/src/mapred/org/apache/hadoop/mapred/MapRunner.java
MODIFY /hadoop/core/trunk/src/contrib/streaming/src/java/org/apache/hadoop/streaming/PipeMapper.java

Repository: ASF
Revision: #725907
Date: Fri Dec 12 05:18:36 UTC 2008
User: ddas
Message: HADOOP-4620. Committing the testcase that I forgot to add earlier.
Files Changed
ADD /hadoop/core/trunk/src/contrib/streaming/src/test/org/apache/hadoop/streaming/TestStreamingEmptyInpNonemptyOut.java

Repository: ASF
Revision: #725908
Date: Fri Dec 12 05:22:20 UTC 2008
User: ddas
Message: Merge -r 725906:725907 and 725904:725905 from trunk onto 0.19 branch. Fixes HADOOP-4620.
Files Changed
MODIFY /hadoop/core/branches/branch-0.19/src/contrib/streaming/src/java/org/apache/hadoop/streaming/PipeMapRed.java
MODIFY /hadoop/core/branches/branch-0.19/src/contrib/streaming/src/java/org/apache/hadoop/streaming/StreamJob.java
MODIFY /hadoop/core/branches/branch-0.19/src/mapred/org/apache/hadoop/mapred/MapRunner.java
ADD /hadoop/core/branches/branch-0.19/src/contrib/streaming/src/test/org/apache/hadoop/streaming/TestStreamingEmptyInpNonemptyOut.java (from /hadoop/core/trunk/src/contrib/streaming/src/test/org/apache/hadoop/streaming/TestStreamingEmptyInpNonemptyOut.java)
MODIFY /hadoop/core/branches/branch-0.19/src/contrib/streaming/src/java/org/apache/hadoop/streaming/PipeMapper.java
MODIFY /hadoop/core/branches/branch-0.19/CHANGES.txt

Repository: ASF
Revision: #725910
Date: Fri Dec 12 05:24:29 UTC 2008
User: ddas
Message: HADOOP-4620. Adding PipeMapRunner.java that I missed earlier.
Files Changed
ADD /hadoop/core/trunk/src/contrib/streaming/src/java/org/apache/hadoop/streaming/PipeMapRunner.java

Repository: ASF
Revision: #725912
Date: Fri Dec 12 05:32:06 UTC 2008
User: ddas
Message: Merge -r 725904:725905 725906:725907 725909:725910 from trunk onto 0.18 branch. Fixes HADOOP-4620.
Files Changed
MODIFY /hadoop/core/branches/branch-0.18/src/contrib/streaming/src/java/org/apache/hadoop/streaming/PipeMapRed.java
MODIFY /hadoop/core/branches/branch-0.18/src/mapred/org/apache/hadoop/mapred/MapRunner.java
ADD /hadoop/core/branches/branch-0.18/src/contrib/streaming/src/test/org/apache/hadoop/streaming/TestStreamingEmptyInpNonemptyOut.java (from /hadoop/core/trunk/src/contrib/streaming/src/test/org/apache/hadoop/streaming/TestStreamingEmptyInpNonemptyOut.java)
MODIFY /hadoop/core/branches/branch-0.18/src/contrib/streaming/src/java/org/apache/hadoop/streaming/PipeMapper.java
MODIFY /hadoop/core/branches/branch-0.18/CHANGES.txt
ADD /hadoop/core/branches/branch-0.18/src/contrib/streaming/src/java/org/apache/hadoop/streaming/PipeMapRunner.java (from /hadoop/core/trunk/src/contrib/streaming/src/java/org/apache/hadoop/streaming/PipeMapRunner.java)
MODIFY /hadoop/core/branches/branch-0.18/src/contrib/streaming/src/java/org/apache/hadoop/streaming/StreamJob.java