Hadoop Common / HADOOP-5210

Reduce Task Progress shows > 100% when the total size of map outputs (for a single reducer) is high

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.20.1
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      This patch resets the variable totalBytesProcessed before the final merge so that it can be used to correctly calculate the progress of reducePhase (the third phase of the reduce task).

      Description

      When the total size of the map outputs for a single reducer (the reduce input size) is high, the reported reduce task progress is greater than 100%.

      1. HADOOP-5210.patch
        0.6 kB
        Ravi Gummadi
      2. HADOOP-5210.v2.1.patch
        1 kB
        Ravi Gummadi
      3. HADOOP-5210.v2.patch
        0.7 kB
        Ravi Gummadi
      4. HADOOP-5210.v3.patch
        2 kB
        Ravi Gummadi
      5. Picture 3.png
        27 kB
        Jothi Padmanabhan

          Activity

          Jothi Padmanabhan added a comment -

          Screen shot showing progress > 100%

          Devaraj Das added a comment -

          This could be because of the way we compute mergeProgress during merges in the reduce. mergeProgress is a function of totalBytesProcessed, and totalBytesProcessed is incremented for every segment considered during a merge. So with multi-level merges we report more progress per byte than we should: many bytes are written to disk by one merge and are then counted again when the next-level merge considers them, and so on.
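The over-counting described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not Hadoop's actual merge code: the class name, the method naiveProgress, and the "merge the smallest segments first" policy are hypothetical; only the name totalBytesProcessed comes from the discussion.

```java
import java.util.PriorityQueue;

// Hypothetical sketch: models segments as byte counts only.
final class MergeProgressSketch {

    /**
     * Merges segments mergeFactor at a time until one remains, adding the
     * size of every segment considered in every pass to totalBytesProcessed.
     * Returns totalBytesProcessed / totalInputBytes, i.e. the naive progress.
     */
    static double naiveProgress(long[] segmentSizes, int mergeFactor) {
        if (mergeFactor < 2) {
            throw new IllegalArgumentException("mergeFactor must be >= 2");
        }
        PriorityQueue<Long> segments = new PriorityQueue<>();
        long totalInputBytes = 0;
        for (long size : segmentSizes) {
            segments.add(size);
            totalInputBytes += size;
        }
        if (totalInputBytes == 0) {
            return 0.0; // nothing to merge, avoid dividing by zero
        }
        long totalBytesProcessed = 0;
        while (segments.size() > 1) {
            long merged = 0;
            for (int i = 0; i < mergeFactor && !segments.isEmpty(); i++) {
                merged += segments.poll();
            }
            // Bytes already counted by an earlier pass are counted again here.
            totalBytesProcessed += merged;
            segments.add(merged);
        }
        return (double) totalBytesProcessed / totalInputBytes;
    }
}
```

With four 100-byte segments, naiveProgress(sizes, 4) needs a single pass and yields 1.0, while naiveProgress(sizes, 2) needs three passes and yields 2.0, i.e. 200% reported progress.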

          Doug Cook added a comment -

          I've seen this problem, too. Am happy to send my local config info if that's useful.

          Ravi Gummadi added a comment -

          As Devaraj mentioned, the problem is in the calculation of mergeProgress when multi-level merges happen.
          Attaching patch that fixes the issue. Please review and provide your comments.

          Jothi Padmanabhan added a comment -

          Carrying over bytes from intermediate merges to the final reduce phase is not correct.

          Ravi Gummadi added a comment -

          Yes, Jothi. When the intermediate merges complete, we can say that the sortPhase is complete, and if we reset the variable totalBytesProcessed before the final merge, we can use it to calculate the progress of reducePhase (the third phase of the reduce task). The patch for HADOOP-3131 removed this reset of totalBytesProcessed.

          Matei, would you please check whether your patch (for HADOOP-3131) removed this reset intentionally, and whether I am missing something?

          Attaching patch which resets the bytes-processed to zero before final merge.
          Please review and provide your comments.
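The effect of the reset proposed in this comment can be sketched as follows. This is a simplified model under stated assumptions, not the attached patch: the class, the method progressAfterFinalMerge, and the flag resetBeforeFinalMerge are hypothetical names, and segments are modelled as byte counts only.

```java
import java.util.PriorityQueue;

// Hypothetical sketch: intermediate merges followed by one final merge.
final class FinalMergeResetSketch {

    /**
     * Runs intermediate merges until at most mergeFactor segments remain,
     * optionally zeroes totalBytesProcessed, then counts the final merge.
     * Returns the progress ratio reported once the final merge has finished.
     */
    static double progressAfterFinalMerge(long[] segmentSizes, int mergeFactor,
                                          boolean resetBeforeFinalMerge) {
        if (mergeFactor < 2) {
            throw new IllegalArgumentException("mergeFactor must be >= 2");
        }
        PriorityQueue<Long> segments = new PriorityQueue<>();
        long totalInputBytes = 0;
        for (long size : segmentSizes) {
            segments.add(size);
            totalInputBytes += size;
        }
        if (totalInputBytes == 0) {
            return 0.0;
        }
        long totalBytesProcessed = 0;
        // Intermediate merges (the sort phase in the discussion above).
        while (segments.size() > mergeFactor) {
            long merged = 0;
            for (int i = 0; i < mergeFactor; i++) {
                merged += segments.poll();
            }
            totalBytesProcessed += merged;
            segments.add(merged);
        }
        if (resetBeforeFinalMerge) {
            totalBytesProcessed = 0; // do not carry sort-phase bytes forward
        }
        // The final merge reads every remaining byte exactly once.
        for (long size : segments) {
            totalBytesProcessed += size;
        }
        return (double) totalBytesProcessed / totalInputBytes;
    }
}
```

For four 100-byte segments and a merge factor of 2, the ratio at the end of the final merge is 2.0 without the reset and 1.0 with it.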

          Ravi Gummadi added a comment -

          Jothi suggested offline that some unnecessary code be removed from merge().

          Attaching a new patch with that change.

          Ravi Gummadi added a comment -

          TestReduceTask was failing with the earlier patch because it ignored the bytes initially read from the segments in the final merge.

          Attaching a patch that resets totalBytesProcessed to the number of bytes already read in this final merge (instead of 0).

          Please review and provide your comments.
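One plausible reading of this change, sketched under loud assumptions: some bytes of each segment have already been read by the time the counter is reset, so resetting to 0 drops them and the final merge never reaches 100%. The class, the method progressAtEnd, and the parameter initialBytesRead are hypothetical and are not taken from the patch or from TestReduceTask.

```java
// Hypothetical sketch: compares resetting the counter to 0 with resetting
// it to the bytes already read from the segments of the final merge.
final class FinalMergeCounterSketch {

    /**
     * @param segmentSizes     total bytes in each segment of the final merge
     * @param initialBytesRead bytes of each segment read before the reset
     * @param resetToBytesRead true: reset to the bytes already read;
     *                         false: reset to 0
     * @return progress ratio once the final merge has consumed all segments
     */
    static double progressAtEnd(long[] segmentSizes, long[] initialBytesRead,
                                boolean resetToBytesRead) {
        if (segmentSizes.length != initialBytesRead.length) {
            throw new IllegalArgumentException("arrays must have equal length");
        }
        long totalBytes = 0;
        long alreadyRead = 0;
        for (int i = 0; i < segmentSizes.length; i++) {
            totalBytes += segmentSizes[i];
            alreadyRead += initialBytesRead[i];
        }
        if (totalBytes == 0) {
            return 0.0;
        }
        long totalBytesProcessed = resetToBytesRead ? alreadyRead : 0;
        // Only the bytes read after the reset are added during the merge.
        totalBytesProcessed += totalBytes - alreadyRead;
        return (double) totalBytesProcessed / totalBytes;
    }
}
```

With segments of 100 and 300 bytes, of which 10 and 30 bytes were already read, resetting to 0 ends at 360/400 of the expected bytes, whereas resetting to the 40 bytes already read ends at exactly 1.0.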

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12402622/HADOOP-5210.v3.patch
          against trunk revision 756352.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no tests are needed for this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/114/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/114/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/114/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/114/console

          This message is automatically generated.

          Jothi Padmanabhan added a comment -

          +1. Patch looks good.

          Devaraj Das added a comment -

          I just committed this. Thanks, Ravi!

          Hudson added a comment -

          Integrated in Hadoop-trunk #790 (see http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/790/).
          Solves a problem in the progress report of the reduce task. Contributed by Ravi Gummadi.

          Ravi Gummadi added a comment -

          It would be nice if this patch could also be committed to branch 0.20; the same patch applies there.
          Devaraj, would you please commit this to 0.20?

          Devaraj Das added a comment -

          I committed this to the 0.20 branch.


            People

            • Assignee: Ravi Gummadi
            • Reporter: Jothi Padmanabhan
            • Votes: 1
            • Watchers: 4