Hadoop Map/Reduce / MAPREDUCE-5000

TaskImpl.getCounters() can return the counters for the wrong task attempt when task is speculating

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.23.6
    • Fix Version/s: 0.23.7, 2.1.0-beta
    • Component/s: mr-am
    • Labels:
      None

      Description

      When a task is speculating and one attempt completes, the counters for the wrong attempt are sometimes aggregated into the job's total counters. The scenario looks like this:

      1. Two task attempts, _0 and _1, are racing.
      2. _1 finishes first, causing the task to issue a TA_KILL to attempt _0.
      3. _0 receives TA_KILL, sets its progress to 1.0f, and waits for container cleanup.
      4. If TaskImpl.getCounters() is called at this point, TaskImpl.selectBestAttempt() can return _0: it has not yet reached the KILLED state, its progress is already at the maximum, and no other attempt has more progress.
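
      The race in the steps above can be reproduced with a small standalone model. This is only a sketch: the class, enum, and field names below (SpeculationRaceSketch, State.CLEANING_UP, recordsCounter) are hypothetical stand-ins, not the real Hadoop types, and the selection loop assumes that selectBestAttempt() skips KILLED attempts and otherwise keeps the first attempt with the highest progress, as the description implies.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model only; none of these types are the real Hadoop classes.
class SpeculationRaceSketch {
    // CLEANING_UP is a hypothetical name for "received TA_KILL, waiting for
    // container cleanup, not yet KILLED".
    enum State { RUNNING, CLEANING_UP, SUCCEEDED, KILLED }

    static final class Attempt {
        final String id;
        final State state;
        final float progress;
        final long recordsCounter; // stand-in for the attempt's counters

        Attempt(String id, State state, float progress, long recordsCounter) {
            this.id = id;
            this.state = state;
            this.progress = progress;
            this.recordsCounter = recordsCounter;
        }
    }

    // Assumed pre-fix behavior: ignore KILLED attempts, then keep the attempt
    // with the strictly highest progress. On a tie the earlier attempt wins.
    static Attempt selectByProgress(List<Attempt> attempts) {
        Attempt best = null;
        for (Attempt a : attempts) {
            if (a.state == State.KILLED) {
                continue;
            }
            if (best == null || a.progress > best.progress) {
                best = a;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<Attempt> attempts = new ArrayList<>();
        // _0 lost the race but already reports full progress while cleaning up.
        attempts.add(new Attempt("_0", State.CLEANING_UP, 1.0f, 40L));
        // _1 is the attempt that actually succeeded.
        attempts.add(new Attempt("_1", State.SUCCEEDED, 1.0f, 100L));

        Attempt chosen = selectByProgress(attempts);
        // The losing attempt is selected, so its counters would be reported.
        System.out.println("selected " + chosen.id + ", counter = " + chosen.recordsCounter);
    }
}
```

      In this model the tie on progress is what lets _0 win; once _0 reaches KILLED it is skipped and _1 is selected, which is why the wrong counters only show up during the cleanup window.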

      Attachments

      1. MAPREDUCE-5000-branch-0.23.patch (5 kB, Jason Lowe)
      2. MAPREDUCE-5000.patch (5 kB, Jason Lowe)

        Activity

        Hudson added a comment -

        Integrated in Hadoop-Mapreduce-trunk #1344 (See https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1344/)
        MAPREDUCE-5000. Fixes getCounters when speculating by fixing the selection of the best attempt for a task. Contributed by Jason Lowe. (Revision 1445871)

        Result = SUCCESS
        sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1445871
        Files :

        • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
        • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskImpl.java
        • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestTaskImpl.java
        Hudson added a comment -

        Integrated in Hadoop-Hdfs-trunk #1316 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1316/)
        MAPREDUCE-5000. Fixes getCounters when speculating by fixing the selection of the best attempt for a task. Contributed by Jason Lowe. (Revision 1445871)

        Result = FAILURE
        sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1445871
        Files :

        • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
        • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskImpl.java
        • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestTaskImpl.java
        Hudson added a comment -

        Integrated in Hadoop-Hdfs-0.23-Build #525 (See https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/525/)
        MAPREDUCE-5000. Fixes getCounters when speculating by fixing the selection of the best attempt for a task. Contributed by Jason Lowe. (Revision 1445874)

        Result = SUCCESS
        sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1445874
        Files :

        • /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
        • /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskImpl.java
        • /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestTaskImpl.java
        Hudson added a comment -

        Integrated in Hadoop-Yarn-trunk #127 (See https://builds.apache.org/job/Hadoop-Yarn-trunk/127/)
        MAPREDUCE-5000. Fixes getCounters when speculating by fixing the selection of the best attempt for a task. Contributed by Jason Lowe. (Revision 1445871)

        Result = SUCCESS
        sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1445871
        Files :

        • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
        • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskImpl.java
        • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestTaskImpl.java
        Hudson added a comment -

        Integrated in Hadoop-trunk-Commit #3354 (See https://builds.apache.org/job/Hadoop-trunk-Commit/3354/)
        MAPREDUCE-5000. Fixes getCounters when speculating by fixing the selection of the best attempt for a task. Contributed by Jason Lowe. (Revision 1445871)

        Result = SUCCESS
        sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1445871
        Files :

        • /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
        • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskImpl.java
        • /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestTaskImpl.java
        Siddharth Seth added a comment -

        Thanks Jason. Committing...

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12569115/MAPREDUCE-5000-branch-0.23.patch
        against trunk revision .

        -1 patch. The patch command could not apply the patch.

        Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3329//console

        This message is automatically generated.

        Jason Lowe added a comment -

        Thanks for the review, Sidd. Here's the patch for branch-0.23.

        Siddharth Seth added a comment -

        +1. Looks good. Jason, could you please provide a branch-23 patch as well? Thanks.

        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12568880/MAPREDUCE-5000.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 1 new or modified test files.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3324//testReport/
        Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3324//console

        This message is automatically generated.

        Jason Lowe added a comment -

        Patch to have selectBestAttempt use the successful attempt, if available.
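
        The approach described in this comment might look roughly like the following. This is an illustrative sketch and not the attached patch: the types and names (PreferSuccessfulSketch, State.CLEANING_UP) are hypothetical, and the progress-based fallback is an assumption about how attempts are chosen when none has succeeded yet.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative model only; none of these types are the real Hadoop classes.
class PreferSuccessfulSketch {
    enum State { RUNNING, CLEANING_UP, SUCCEEDED, KILLED }

    static final class Attempt {
        final String id;
        final State state;
        final float progress;

        Attempt(String id, State state, float progress) {
            this.id = id;
            this.state = state;
            this.progress = progress;
        }
    }

    static Attempt selectBestAttempt(List<Attempt> attempts) {
        // Use the successful attempt, if available, regardless of what
        // progress the other attempts report.
        for (Attempt a : attempts) {
            if (a.state == State.SUCCEEDED) {
                return a;
            }
        }
        // Assumed fallback: highest progress among attempts that are not KILLED.
        Attempt best = null;
        for (Attempt a : attempts) {
            if (a.state == State.KILLED) {
                continue;
            }
            if (best == null || a.progress > best.progress) {
                best = a;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<Attempt> attempts = Arrays.asList(
            new Attempt("_0", State.CLEANING_UP, 1.0f), // lost the race, still cleaning up
            new Attempt("_1", State.SUCCEEDED, 1.0f));  // finished first
        System.out.println("selected " + selectBestAttempt(attempts).id);
    }
}
```

        With the successful attempt checked first, the progress reported by an attempt that is being killed no longer affects which attempt is selected.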


          People

          • Assignee: Jason Lowe
          • Reporter: Jason Lowe
          • Votes: 0
          • Watchers: 7
