Hadoop Map/Reduce / MAPREDUCE-2392

TaskTracker shutdown in the tests sometimes takes 60s

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.22.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      There are a lot of the following in the test logs:

      2011-03-16 13:47:02,267 INFO  mapred.TaskTracker (TaskTracker.java:shutdown(1275)) - Shutting down StatusHttpServer
      2011-03-16 13:48:02,349 ERROR mapred.TaskTracker (TaskTracker.java:offerService(1609)) - Caught exception: java.io.IOException: Call to localhost/127.0.0.1:57512 failed on local exception: java.nio.channels.ClosedByInterruptException
      

      Note that just over one minute (about 60 seconds) elapses between the first line and the second.

          Activity

          Tom White added a comment -

          This is reminiscent of HADOOP-5380, except in this case it is the TT-JT communication that is timing out.

          The problem is that calling interrupt() on the TaskTracker thread can (by chance) cause it to interrupt the heartbeat RPC call, which then takes 60 seconds to time out ("ipc.ping.interval") from readInt() in org.apache.hadoop.ipc.Client$Connection.receiveResponse.

          This can be fixed by removing the interrupt call to the TaskTracker, since we already call shutdown() on the TaskTracker then join() on the thread running it.
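
          The shutdown-then-join idea can be sketched as follows. This is a minimal illustration, not the actual patch and not Hadoop code: the class CooperativeShutdownSketch, the Worker class, and the stop() helper are all hypothetical names, and Thread.sleep() merely stands in for the blocking heartbeat RPC. The point is that the stopping thread never calls interrupt(); it sets a flag and waits for the worker to finish its current call.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CooperativeShutdownSketch {

    /** Hypothetical stand-in for a service such as the TaskTracker; not Hadoop code. */
    static class Worker implements Runnable {
        // volatile so the worker thread sees the update made by shutdown()
        private volatile boolean running = true;
        private final AtomicInteger heartbeats = new AtomicInteger();

        @Override
        public void run() {
            while (running) {
                heartbeat();
            }
        }

        // Assumption: a short sleep stands in for the blocking heartbeat RPC.
        private void heartbeat() {
            try {
                Thread.sleep(10);
                heartbeats.incrementAndGet();
            } catch (InterruptedException e) {
                // Not expected in this sketch, since nobody calls interrupt().
                Thread.currentThread().interrupt();
                running = false;
            }
        }

        /** Asks the loop to exit after the current heartbeat completes. */
        void shutdown() {
            running = false;
        }

        int heartbeatCount() {
            return heartbeats.get();
        }
    }

    /**
     * Stops the worker without interrupting its thread: set the flag, then join.
     * Returns true if the thread exited within the timeout.
     */
    static boolean stop(Worker worker, Thread thread, long timeoutMillis)
            throws InterruptedException {
        worker.shutdown();
        thread.join(timeoutMillis);
        return !thread.isAlive();
    }

    public static void main(String[] args) throws InterruptedException {
        Worker worker = new Worker();
        Thread thread = new Thread(worker, "worker");
        thread.start();
        Thread.sleep(50); // let a few heartbeats run
        boolean stopped = stop(worker, thread, 1000);
        System.out.println("stopped cleanly: " + stopped
                + ", heartbeats: " + worker.heartbeatCount());
    }
}
```

          Because the in-flight call is allowed to complete normally, the worker exits on its next loop check rather than waiting on a timeout triggered by an interrupted channel.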

          Tom White added a comment -

          This might be fixed by HADOOP-6762. Even if it is, we might still use this patch, since there is no need to call both shutdown() and interrupt() from tests.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12473882/MAPREDUCE-2392.patch
          against trunk revision 1082400.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 6 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          +1 core tests. The patch passed core unit tests.

          -1 contrib tests. The patch failed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://hudson.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/139//testReport/
          Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/139//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://hudson.apache.org/hudson/job/PreCommit-MAPREDUCE-Build/139//console

          This message is automatically generated.

          Todd Lipcon added a comment -

          Seems reasonable to me. +1

          Tom White added a comment -

          With this patch Jenkins ran the tests in 3 hours and 45 minutes. Previous runs have been upwards of 5 hours.

          Tom White added a comment -

          I've just committed this.

          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk-Commit #635 (See https://hudson.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/635/)
          MAPREDUCE-2392. TaskTracker shutdown in the tests sometimes take 60s.

          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-22-branch #38 (See https://hudson.apache.org/hudson/job/Hadoop-Mapreduce-22-branch/38/)
          Merge -r 1082702:1082703 from trunk to branch-0.22. Fixes: MAPREDUCE-2392

          Hudson added a comment -

          Integrated in Hadoop-Mapreduce-trunk #643 (See https://hudson.apache.org/hudson/job/Hadoop-Mapreduce-trunk/643/)


            People

            • Assignee: Tom White
            • Reporter: Tom White
            • Votes: 0
            • Watchers: 2
