Hadoop Map/Reduce / MAPREDUCE-1635

ResourceEstimator does not work after MAPREDUCE-842

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.21.0
    • Fix Version/s: 0.21.0
    • Component/s: tasktracker
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Fixed a bug related to resource estimation for disk-based scheduling by modifying TaskTracker to return correct map output size for the completed maps and -1 for other tasks or failures.

      Description

      MAPREDUCE-842 changed the Child's mapred.local.dir to use attemptDir as the base local directory. The assumption is that
      org.apache.hadoop.mapred.MapOutputFile always gets the Child's mapred.local.dir.
      However, MapOutputFile.getOutputFile() is called with the TaskTracker's conf, which does not find the output file. As a result, TaskTracker.tryToGetOutputSize() always returns -1.
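
The mismatch described above can be illustrated with a minimal sketch in plain Java. It does not use the Hadoop API: the class name LocalDirMismatch, the helper resolveOutput(), the directory name attempt_0 and the relative path output/file.out are all assumptions made for illustration. The file is written under the attempt directory, so a lookup rooted at the tracker-level directory fails and the size falls back to -1.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class LocalDirMismatch {

    // Hypothetical stand-in for resolving a map output file relative to a
    // configured local directory; this is not the real MapOutputFile logic.
    static Path resolveOutput(Path localDir) {
        return localDir.resolve("output").resolve("file.out");
    }

    // Returns the file size, or -1 if the file cannot be found under localDir.
    static long tryToGetOutputSize(Path localDir) {
        try {
            return Files.size(resolveOutput(localDir));
        } catch (IOException e) {
            return -1L;
        }
    }

    public static void main(String[] args) throws IOException {
        // Tracker-level local dir, and an attempt dir nested beneath it (assumed layout).
        Path trackerLocalDir = Files.createTempDirectory("tt-local");
        Path attemptDir = Files.createDirectories(trackerLocalDir.resolve("attempt_0"));

        // The child writes its output relative to the attempt dir.
        Path output = resolveOutput(attemptDir);
        Files.createDirectories(output.getParent());
        Files.write(output, new byte[128]);

        System.out.println(tryToGetOutputSize(attemptDir));      // 128
        System.out.println(tryToGetOutputSize(trackerLocalDir)); // -1
    }
}
```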

      1. patch-1635-1.txt
        9 kB
        Amareshwari Sriramadasu
      2. patch-1635.txt
        8 kB
        Amareshwari Sriramadasu
      3. ASF.LICENSE.NOT.GRANTED--patch-1635-ydist.txt
        9 kB
        Amareshwari Sriramadasu

          Activity

          Amareshwari Sriramadasu added a comment -

          I think the solution is to move the calculation of task output size to Task, instead of TaskTracker trying to construct the output file and failing. Task already has all the information of MapOutputFile. So, Task can set the output size in its last update, before sending umbilical.done().

          The attached patch implements this fix. I added a MiniMR test that checks task output sizes for a map-only job, a map-reduce job and a failed job.

          In trunk, the log message " reported output size..." in TaskTracker.TaskInProgress.reportDone() does not make sense, because setOutputSize() happens after the reportDone() call.
          With the attached patch it does make sense. I validated that the log prints the proper value with the patch.

          The patch removes the following null checks from the code:

          -      Path tmp_output =  mapOutputFile.getOutputFile();
          -      if(tmp_output == null)
          -        return 0;
          -      FileSystem localFS = FileSystem.getLocal(conf);
          -      FileStatus stat = localFS.getFileStatus(tmp_output);
          -      if(stat == null)
          -        return 0;
          

          These checks are removed because neither mapOutputFile.getOutputFile() nor localFS.getFileStatus(tmp_output) ever returns null. Each call either returns a proper value or throws an Exception, and the method handles the Exception properly, so the checks are essentially unreachable code. Moreover, returning 0 deviates from the documentation, which says the output size should be -1 if it cannot be calculated.

          Also, TaskStatus.outputSize is initialized to -1 to take care of task failures.
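
The approach in this comment can be sketched roughly as follows. This is a simplified illustration in plain Java, not the code from the patch: the class OutputSizeSketch, the nested Status class (a stand-in for TaskStatus) and the finish() helper are hypothetical, and java.nio.file is used in place of the Hadoop FileSystem API. It shows the two behaviours described above: the size defaults to -1, and a lookup failure surfaces as an exception that is mapped to -1 rather than to 0.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class OutputSizeSketch {

    // Hypothetical stand-in for TaskStatus: the size starts at -1 so that a
    // task which fails before its last update reports "unknown".
    static class Status {
        private long outputSize = -1L;

        long getOutputSize() { return outputSize; }

        void setOutputSize(long size) { outputSize = size; }
    }

    // Returns the size of the map output, or -1 if it cannot be calculated.
    // The lookup either yields a value or throws, so no null checks are needed.
    static long calculateOutputSize(Path mapOutput) {
        try {
            return Files.size(mapOutput);
        } catch (IOException e) {
            return -1L;
        }
    }

    // Task-side step: record the size in the last status update, before the
    // task would signal completion to the tracker.
    static Status finish(Path mapOutput) {
        Status status = new Status();
        status.setOutputSize(calculateOutputSize(mapOutput));
        return status;
    }

    public static void main(String[] args) throws IOException {
        Path output = Files.createTempFile("map", ".out");
        Files.write(output, new byte[42]);
        System.out.println(finish(output).getOutputSize()); // 42

        Files.delete(output);
        System.out.println(finish(output).getOutputSize()); // -1
    }
}
```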

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12440046/patch-1635.txt
          against trunk revision 928104.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 6 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/67/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/67/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/67/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/67/console

          This message is automatically generated.

          Amareshwari Sriramadasu added a comment -

          -1 contrib tests.

          TestStreamingBadRecords failed because of ZipException, and is not related to the patch.

          stacktrace:

          java.io.IOException: java.util.zip.ZipException: ZIP_Read: error reading zip file
          	at org.apache.hadoop.streaming.StreamJob.go(StreamJob.java:166)
          	at org.apache.hadoop.streaming.TestStreamingBadRecords.testNarrowDown(TestStreamingBadRecords.java:226)
          
          Amareshwari Sriramadasu added a comment -

          TestStreamingBadRecords failed because of ZipException, and is not related to the patch.

          The same test passed on my machine.

          The added JUnit test fails without the patch and passes with the patch.

          Vinod Kumar Vavilapalli added a comment -

          Patch looks nice and good. Am going to commit this to trunk.

          Vinod Kumar Vavilapalli added a comment -

          Argh.. one catch.. MapOutputFile is still being used in the TaskTracker process space, as a member of the TaskRunner class. Won't this still leave other bugs possible, like in TaskRunner.prepare()?

          We should also slightly enhance MapOutputFile's javadoc to SHOUT ALOUD that it should not be used in TT address space.

          Ravi Gummadi added a comment -

          Had a quick look at the patch. Some comments:

          (1) getRunState() != TaskStatus.State.SUCCEEDED check is removed in calculateOutputSize(). I guess we need this check if some other caller also calls this method.
          (2) catch(Exception) can be changed to catch(IOException) in calculateOutputSize() ?

          Amareshwari Sriramadasu added a comment -

          MapOutputFile is still being used in TaskTracker process space, as a member of TaskRunner class.

          I raised MAPREDUCE-1662 for this.

          getRunState() != TaskStatus.State.SUCCEEDED check is removed in calculateOutputSize(). I guess we need this check if some other caller also calls this method

          This check is not needed because calculateOutputSize is called from done(). If any other caller calls it, it will return -1 because the file is not there. We cannot add the check, since the SUCCEEDED state is not set by Task; it is set by TaskTracker whenever the Task reports done().
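
The ordering argument in this reply can be made concrete with a small, heavily simplified sketch. Everything in it is hypothetical (the class RunStateSketch, the two-value State enum, and modelling the run state as a single field that the tracker flips), and it is not the Hadoop code. It only shows that a SUCCEEDED guard evaluated on the task side, before the tracker has processed done(), would always yield -1.

```java
public class RunStateSketch {

    enum State { RUNNING, SUCCEEDED }

    // Hypothetical task-side view: the task never marks itself SUCCEEDED.
    static class Task {
        State runState = State.RUNNING;
        final long bytesOnDisk;

        Task(long bytesOnDisk) {
            this.bytesOnDisk = bytesOnDisk;
        }

        // Variant with the state guard that was suggested in review.
        long guardedOutputSize() {
            return runState == State.SUCCEEDED ? bytesOnDisk : -1L;
        }

        // Variant without the guard.
        long outputSize() {
            return bytesOnDisk;
        }
    }

    // Hypothetical tracker-side handler: the state changes only after the
    // task has reported done.
    static void reportDone(Task task) {
        task.runState = State.SUCCEEDED;
    }

    public static void main(String[] args) {
        Task task = new Task(2048L);

        // At the point where the task computes its size, done() has not yet
        // been processed, so the guarded variant cannot report a real size.
        System.out.println(task.guardedOutputSize()); // -1
        System.out.println(task.outputSize());        // 2048

        reportDone(task);
    }
}
```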

          Amareshwari Sriramadasu added a comment -

          Patch adds javadoc to MapOutputFile and changes catch(Exception) to catch(IOException) as suggested by Ravi.

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12440447/patch-1635-1.txt
          against trunk revision 929712.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 6 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/86/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/86/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/86/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/86/console

          This message is automatically generated.

          Vinod Kumar Vavilapalli added a comment -

          +1. The patch looks good to me. Rerunning it through Hudson so I can commit it after its blessing.

          Amareshwari Sriramadasu added a comment -

          Patch for Yahoo! distribution.

          Amareshwari Sriramadasu added a comment -

          For some reason, Hudson failed to connect to JIRA. I am pasting the test output from the console: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/98/console

          [exec]
          [exec]
          [exec] +1 overall. Here are the results of testing the latest attachment
          [exec] http://issues.apache.org/jira/secure/attachment/12440447/patch-1635-1.txt
          [exec] against trunk revision 931743.
          [exec]
          [exec] +1 @author. The patch does not contain any @author tags.
          [exec]
          [exec] +1 tests included. The patch appears to include 6 new or modified tests.
          [exec]
          [exec] +1 javadoc. The javadoc tool did not generate any warning messages.
          [exec]
          [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
          [exec]
          [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
          [exec]
          [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.
          [exec]
          [exec] +1 core tests. The patch passed core unit tests.
          [exec]
          [exec] +1 contrib tests. The patch passed contrib unit tests.
          [exec]
          [exec] Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/98/testReport/
          [exec] Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/98/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          [exec] Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/98/artifact/trunk/build/test/checkstyle-errors.html
          [exec] Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h4.grid.sp2.yahoo.net/98/console
          [exec]
          [exec] This message is automatically generated.
          [exec]
          [exec]
          [exec] ======================================================================
          [exec] ======================================================================
          [exec] Adding comment to Jira.
          [exec] ======================================================================
          [exec] ======================================================================
          [exec]
          [exec]
          [exec] Failed to connect to: http://issues.apache.org/jira/rpc/soap/jirasoapservice-v2?wsdl
          [exec] Failed to connect to: http://issues.apache.org/jira/rpc/soap/jirasoapservice-v2?wsdl
          [exec] Failed to connect to: http://issues.apache.org/jira/rpc/soap/jirasoapservice-v2?wsdl
          [

          Vinod Kumar Vavilapalli added a comment -

          I just committed this to trunk. Thanks Amareshwari!

          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk #285 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk/285/)
          MAPREDUCE-1635. ResourceEstimator does not work after MAPREDUCE-842. Contributed by Amareshwari Sriramadasu.


            People

            • Assignee:
              Amareshwari Sriramadasu
              Reporter:
              Amareshwari Sriramadasu
            • Votes:
              0
              Watchers:
              2
