Hadoop Map/Reduce / MAPREDUCE-717

Fix some corner-case issues in speculative execution (post HADOOP-2141)

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.21.0
    • Fix Version/s: 0.21.0
    • Component/s: jobtracker
    • Labels:
      None
    • Release Note:
      Fixes some edge cases while using speculative execution

      Description

      Some corner-case issues can be fixed:
      1) Setup tasks should not add anything to the job statistics (they are really fast and might skew the statistics of a job with few tasks).
      2) The statistics computations should be guarded against cases where things like sumOfSquares could become less than zero (mostly due to rounding errors).
      3) The method TaskInProgress.getCurrentProgressRate() should take the COMMIT_PENDING state into account.
      4) The test case TestSpeculativeExecution.testTaskLATEScheduling could be made more robust.
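As an illustration of item 2, here is a minimal sketch of a guarded statistics computation. It assumes the variance is derived from a running sum and a running sum of squares; the class name RunningStats and its methods are hypothetical and are not taken from the patch or from the Hadoop source.

```java
/** Hypothetical running-statistics holder; illustrative only, not the Hadoop class. */
final class RunningStats {
    private long count;
    private double sum;
    private double sumOfSquares;

    /** Records one sample (for example, a task's progress rate). */
    void add(double value) {
        count++;
        sum += value;
        sumOfSquares += value * value;
    }

    double mean() {
        return count == 0 ? 0.0 : sum / count;
    }

    /**
     * Population variance computed as E[x^2] - mean^2.
     * Rounding can push the difference slightly below zero,
     * so the result is clamped at zero before it is used.
     */
    double variance() {
        if (count == 0) {
            return 0.0;
        }
        double m = mean();
        return Math.max(sumOfSquares / count - m * m, 0.0);
    }

    /** Standard deviation; never NaN because variance() is non-negative. */
    double std() {
        return Math.sqrt(variance());
    }
}
```

Without the clamp, a tiny negative difference would make Math.sqrt return NaN, and any comparison against that value would silently evaluate to false.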

      1. 717.patch
        5 kB
        Devaraj Das
      2. 717.patch
        16 kB
        Devaraj Das

        Activity

        Devaraj Das added a comment -

        Patch with the fixes.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12412839/717.patch
        against trunk revision 792613.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        -1 contrib tests. The patch failed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/368/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/368/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/368/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/368/console

        This message is automatically generated.

        Devaraj Das added a comment -

        Apart from what I mentioned in the JIRA description, here are a few other fixes:
        1) Changed the speculative cap to take the task type (map or reduce) into account. In the HADOOP-2141 patch, it is a single global count across maps and reduces. The problem I encountered was that in some cases the reduce tasks would take up the slots towards the beginning of the job, and the last few maps wouldn't get any speculative slot.
        2) Changed the dispatch time to be tracked per task attempt. In the HADOOP-2141 patch, it is tracked per task.
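The per-type cap in item 1 can be pictured with a small sketch. This is an assumption-laden illustration rather than the code in the patch: the class SpeculativeBudget, its TaskType enum, and the fixed cap values passed to the constructor are all hypothetical, and the way the real cap is computed is not shown here.

```java
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical per-task-type speculative budget; all names are illustrative. */
final class SpeculativeBudget {
    enum TaskType { MAP, REDUCE }

    private final Map<TaskType, Integer> running = new EnumMap<>(TaskType.class);
    private final Map<TaskType, Integer> cap = new EnumMap<>(TaskType.class);

    SpeculativeBudget(int mapCap, int reduceCap) {
        cap.put(TaskType.MAP, mapCap);
        cap.put(TaskType.REDUCE, reduceCap);
        running.put(TaskType.MAP, 0);
        running.put(TaskType.REDUCE, 0);
    }

    /** Reserves a speculative slot for the type, or returns false if that type is at its cap. */
    boolean tryAcquire(TaskType type) {
        int current = running.get(type);
        if (current >= cap.get(type)) {
            return false;
        }
        running.put(type, current + 1);
        return true;
    }

    /** Returns a slot when a speculative attempt of the given type finishes. */
    void release(TaskType type) {
        int current = running.get(type);
        if (current > 0) {
            running.put(type, current - 1);
        }
    }
}
```

Because each type draws from its own counter, reduces that exhaust their budget early in the job cannot prevent the last few maps from obtaining a speculative slot.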

        All the unit tests and test-patch passed.

        Devaraj Das added a comment -

        Also disabled speculation for recovered jobs (jobs that are recovered after a JobTracker restart). That part needs some more thought...

        Amareshwari Sriramadasu added a comment -

        Changes look fine to me.

        Devaraj Das added a comment -

        I just committed this.


          People

          • Assignee: Devaraj Das
          • Reporter: Devaraj Das
          • Votes: 0
          • Watchers: 4
