
Key: HADOOP-5210
Type: Bug
Status: Resolved
Resolution: Fixed
Priority: Minor
Assignee: Ravi Gummadi
Reporter: Jothi Padmanabhan
Votes: 1
Watchers: 5

Hadoop Common

Reduce Task Progress shows > 100% when the total size of map outputs (for a single reducer) is high

Created: 10/Feb/09 07:36 AM   Updated: 20/May/09 09:54 AM
Component/s: None
Affects Version/s: None
Fix Version/s: 0.20.1

Time Tracking:
Not Specified

File Attachments (all licensed for inclusion in ASF works):
HADOOP-5210.patch, 2009-03-12 11:16 AM, Ravi Gummadi, 0.6 kB
HADOOP-5210.v2.1.patch, 2009-03-19 04:06 AM, Ravi Gummadi, 1 kB
HADOOP-5210.v2.patch, 2009-03-16 09:58 AM, Ravi Gummadi, 0.7 kB
HADOOP-5210.v3.patch, 2009-03-20 04:28 AM, Ravi Gummadi, 2 kB
Image Attachments:

1. Picture 3.png
(27 kB)
Issue Links:
Reference
 

Hadoop Flags: Reviewed
Release Note: This patch resets the variable totalBytesProcessed before the final merge so that it can be used to correctly calculate the progress of reducePhase (the 3rd phase of the reduce task).
Resolution Date: 25/Mar/09 09:00 AM


Description
When the total map output size (reduce input size) for a single reducer is high, the reported reduce task progress is greater than 100%.

Jothi Padmanabhan added a comment - 10/Feb/09 07:37 AM
Screen shot showing progress > 100%

Devaraj Das added a comment - 18/Feb/09 06:11 AM
This could be because of the way we compute mergeProgress during merges in the reduce. The mergeProgress is a function of totalBytesProcessed, and totalBytesProcessed is incremented for every segment considered during a merge. So if we have multi-level merges, we would report more progress per byte than we should: many bytes would hit the disk after an intermediate merge, but they would be counted again when considered for the next-level merge, and so on.
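
A minimal sketch of the double counting described above, not the actual Hadoop merge code. The class name, method name, and merge order are hypothetical; only the counter name totalBytesProcessed comes from this issue. It assumes intermediate merges combine `factor` segments at a time and that every merge pass adds the bytes it reads to the same counter.

```java
import java.util.ArrayList;
import java.util.List;

public class MergeProgressSketch {

    /**
     * Hypothetical model of a multi-level merge. Returns the final value of
     * totalBytesProcessed divided by the real input size, i.e. the progress
     * that would be reported at the end if the counter is never reset.
     */
    public static double naiveProgress(long[] segmentSizes, int factor) {
        List<Long> segments = new ArrayList<>();
        long totalInputBytes = 0;
        for (long size : segmentSizes) {
            segments.add(size);
            totalInputBytes += size;
        }

        long totalBytesProcessed = 0;

        // Intermediate merges: combine `factor` segments into one on-disk segment.
        while (segments.size() > factor) {
            long merged = 0;
            for (int i = 0; i < factor; i++) {
                merged += segments.remove(0);
            }
            totalBytesProcessed += merged; // bytes counted during this pass
            segments.add(merged);          // the same bytes will be read again later
        }

        // Final merge: every remaining segment is read once more.
        for (long size : segments) {
            totalBytesProcessed += size;
        }

        return (double) totalBytesProcessed / totalInputBytes;
    }

    public static void main(String[] args) {
        long[] sizes = {100, 100, 100, 100};
        System.out.println("factor 2: " + naiveProgress(sizes, 2));
        System.out.println("factor 4: " + naiveProgress(sizes, 4));
    }
}
```

In this model, four 100-byte segments with a merge factor of 2 add 800 bytes to the counter against 400 bytes of real input, so the reported progress ends at 200%. With a factor of 4 there is no intermediate merge and the progress ends at exactly 100%.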

Doug Cook added a comment - 11/Mar/09 05:43 PM
I've seen this problem, too. Am happy to send my local config info if that's useful.

Ravi Gummadi added a comment - 12/Mar/09 11:16 AM
As Devaraj mentioned, the problem is in the calculation of mergeProgress when multi-level merges happen.
Attaching patch that fixes the issue. Please review and provide your comments.

Jothi Padmanabhan added a comment - 13/Mar/09 11:40 AM
Carrying over bytes from intermediate merges to the final reduce phase is not correct.

Ravi Gummadi added a comment - 16/Mar/09 09:58 AM
Yes, Jothi. When the intermediate merges complete, we can say that the sortPhase is completed, and if we reset the variable totalBytesProcessed before the final merge, we can use it for calculating the progress of reducePhase (the 3rd phase of the reduce task). The patch for HADOOP-3131 removed this reset of totalBytesProcessed.

Matei, would you please check whether your patch (for HADOOP-3131) removed this reset intentionally, and whether I am missing something?

Attaching patch which resets the bytes-processed to zero before final merge.
Please review and provide your comments.


Ravi Gummadi added a comment - 19/Mar/09 04:06 AM
Jothi suggested offline that some unnecessary code be removed from merge().

Attaching new patch with that change.


Ravi Gummadi added a comment - 20/Mar/09 04:28 AM
TestReduceTask was failing with the earlier patch because it ignored the starting bytes already read from the segments in the final merge.

Attaching a patch that resets totalBytesProcessed to the number of bytes already read in this final merge (instead of 0).

Please review and provide your comments.
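
The difference between resetting to 0 and resetting to the bytes already read can be sketched as below. This is a simplified model under stated assumptions, not the code in the attached patch: the class, method, and parameter names are hypothetical, and it assumes progress for the final merge is totalBytesProcessed divided by the total size of the segments in that merge.

```java
public class FinalMergeResetSketch {

    /**
     * Hypothetical model of final-merge progress.
     *
     * @param finalSegmentSizes   sizes of the segments taking part in the final merge
     * @param resetValue          value totalBytesProcessed is reset to before the final merge
     * @param bytesReadAfterReset bytes read from the segments after the reset
     */
    public static double finalMergeProgress(long[] finalSegmentSizes,
                                            long resetValue,
                                            long bytesReadAfterReset) {
        long totalBytes = 0;
        for (long size : finalSegmentSizes) {
            totalBytes += size;
        }
        if (totalBytes == 0) {
            return 1.0; // nothing to merge, treat as complete
        }
        // Counter starts from the reset value, not from the bytes
        // accumulated during intermediate merges.
        long totalBytesProcessed = resetValue + bytesReadAfterReset;
        return (double) totalBytesProcessed / totalBytes;
    }

    public static void main(String[] args) {
        long[] finalSegments = {200, 200};
        long alreadyRead = 20; // assumed: bytes consumed while opening the segments
        System.out.println(finalMergeProgress(finalSegments, alreadyRead, 380));
        System.out.println(finalMergeProgress(finalSegments, 0, 380));
    }
}
```

In this model, with two 200-byte segments of which 20 bytes were already read before the reset, resetting to 20 lets progress reach 100% once the remaining 380 bytes are read, while resetting to 0 leaves it at 95%.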


Hadoop QA added a comment - 20/Mar/09 11:00 PM
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12402622/HADOOP-5210.v3.patch
against trunk revision 756352.

+1 @author. The patch does not contain any @author tags.

-1 tests included. The patch doesn't appear to include any new or modified tests.
Please justify why no tests are needed for this patch.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 javac. The applied patch does not increase the total number of javac compiler warnings.

+1 findbugs. The patch does not introduce any new Findbugs warnings.

+1 Eclipse classpath. The patch retains Eclipse classpath integrity.

+1 release audit. The applied patch does not increase the total number of release audit warnings.

+1 core tests. The patch passed core unit tests.

+1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/114/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/114/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/114/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/114/console

This message is automatically generated.


Jothi Padmanabhan added a comment - 23/Mar/09 09:37 AM
+1. Patch looks good.

Devaraj Das added a comment - 25/Mar/09 09:00 AM
I just committed this. Thanks, Ravi!

Hudson added a comment - 25/Mar/09 08:18 PM
Integrated in Hadoop-trunk #790 (see http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/790/).
Solves a problem in the progress report of the reduce task. Contributed by Ravi Gummadi.

Ravi Gummadi added a comment - 15/May/09 06:15 AM
It would be nice if this patch could also be committed to branch 0.20.
The same patch applies to branch 0.20 as is.
Devaraj, would you please commit this to 0.20?

Devaraj Das added a comment - 20/May/09 09:54 AM
I committed this to the 0.20 branch.