Hadoop Common / HADOOP-9655

Connection object in IPC Client cannot run concurrently during connection timeout

    Details

    • Type: Bug
    • Status: Patch Available
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.0.4-alpha
    • Fix Version/s: None
    • Component/s: ipc
    • Labels:

      Description

      When a machine powers off while a job is running, MRAppMaster finds that the tasks on that host have timed out and then calls stopContainer for each container concurrently.
      However, the IPC layer handles these calls serially. For each call, the connection timeout exception takes a few minutes to be raised, after 45 retries. As a result, the AM hangs for many hours waiting for stopContainer to finish.
      The jstack output shows that most threads are stuck at Connection.addCall, waiting for a lock held by Connection.setupIOstreams.
      (The setupIOstreams method runs slowly because of the connection timeout during setupConnection.)
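
      The blocking described above can be reproduced with a small model. This is a simplified sketch, not the Hadoop Client.Connection source: the class name SharedMonitorConnection, the helper measureAddCallDelayMillis, and the sleep that stands in for the connection timeout are all assumptions made for illustration. It only assumes that both methods synchronize on the same connection object, as the jstack output suggests.

```java
import java.util.concurrent.CountDownLatch;

/** Simplified model of the contention; NOT the actual Hadoop Client.Connection code. */
class SharedMonitorConnection {
    private final CountDownLatch setupStarted = new CountDownLatch(1);
    private int pendingCalls = 0;

    /** Stands in for setupIOstreams: holds the connection monitor while "connecting". */
    synchronized void setupIOStreams(long connectMillis) throws InterruptedException {
        setupStarted.countDown();      // signal that the monitor is now held
        Thread.sleep(connectMillis);   // simulated connect timeout; sleep keeps the monitor
    }

    /** Stands in for addCall: needs the same monitor, so it waits for setup to return. */
    synchronized void addCall() {
        pendingCalls++;
    }

    /** Hypothetical helper: how long addCall is blocked while another thread runs setup. */
    static long measureAddCallDelayMillis(long connectMillis) {
        SharedMonitorConnection conn = new SharedMonitorConnection();
        Thread setup = new Thread(() -> {
            try {
                conn.setupIOStreams(connectMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        try {
            setup.start();
            conn.setupStarted.await();   // wait until setup holds the monitor
            long start = System.nanoTime();
            conn.addCall();              // blocks until setupIOStreams returns
            long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
            setup.join();
            return elapsedMillis;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("addCall blocked for about "
            + measureAddCallDelayMillis(500) + " ms");
    }
}
```

      In this model every caller of addCall waits for the full simulated timeout, which matches the serial behaviour reported above when many stopContainer calls share one connection.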

        Activity

        Nemon Lou added a comment -

        This patch uses a different object for wait and notify, so a thread invoking the addCall method won't be blocked by another thread calling the setupConnection method.
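
        The idea of a separate wait/notify object can be sketched as follows. This is an illustration under stated assumptions, not the attached patch: the class SplitLockConnection, the field callsLock, and the helpers pendingCalls and measureAddCallDelayMillis are hypothetical names, and the sleep again stands in for a slow connection setup.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.CountDownLatch;

/** Sketch of the separate-lock idea; an illustration, NOT the attached patch. */
class SplitLockConnection {
    private final Object callsLock = new Object();   // hypothetical dedicated lock
    private final Queue<Integer> calls = new ArrayDeque<>();
    private final CountDownLatch setupStarted = new CountDownLatch(1);

    /** Slow setup still holds the connection monitor (this), as before. */
    synchronized void setupConnection(long connectMillis) throws InterruptedException {
        setupStarted.countDown();
        Thread.sleep(connectMillis);   // simulated connect timeout
    }

    /** Registers a call under callsLock only, so it never waits on the setup monitor. */
    void addCall(int callId) {
        synchronized (callsLock) {
            calls.add(callId);
            callsLock.notify();        // wake a thread waiting in waitForCall
        }
    }

    /** Waits on the same dedicated lock until a call is available. */
    int waitForCall() throws InterruptedException {
        synchronized (callsLock) {
            while (calls.isEmpty()) {
                callsLock.wait();
            }
            return calls.remove();
        }
    }

    int pendingCalls() {
        synchronized (callsLock) {
            return calls.size();
        }
    }

    /** Hypothetical helper: how long addCall takes while another thread runs setup. */
    static long measureAddCallDelayMillis(long connectMillis) {
        SplitLockConnection conn = new SplitLockConnection();
        Thread setup = new Thread(() -> {
            try {
                conn.setupConnection(connectMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        try {
            setup.start();
            conn.setupStarted.await();   // setup now holds the connection monitor
            long start = System.nanoTime();
            conn.addCall(1);             // uses callsLock, so it does not wait for setup
            long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
            setup.join();
            return elapsedMillis;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("addCall returned after about "
            + measureAddCallDelayMillis(1000) + " ms");
    }
}
```

        The sketch only shows that call registration no longer queues behind the setup monitor. Whether callers make progress afterwards depends on the rest of the client, which is not modelled here.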

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12591789/HADOOP-9655.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 eclipse:eclipse. The patch built with eclipse:eclipse.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/2767//testReport/
        Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/2767//console

        This message is automatically generated.

        Nemon Lou added a comment -

        This patch has been tested on my cluster and has solved the problem.

        Hadoop QA added a comment -



        -1 overall



        | Vote | Subsystem | Runtime | Comment |
        | 0 | pre-patch | 14m 31s | Pre-patch trunk compilation is healthy. |
        | +1 | @author | 0m 0s | The patch does not contain any @author tags. |
        | -1 | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
        | +1 | javac | 7m 29s | There were no new javac warning messages. |
        | +1 | javadoc | 9m 31s | There were no new javadoc warning messages. |
        | +1 | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. |
        | -1 | checkstyle | 1m 3s | The applied patch generated 7 new checkstyle issues (total was 106, now 113). |
        | +1 | whitespace | 0m 0s | The patch has no lines that end in whitespace. |
        | +1 | install | 1m 35s | mvn install still works. |
        | +1 | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. |
        | +1 | findbugs | 1m 40s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. |
        | +1 | common tests | 23m 11s | Tests passed in hadoop-common. |
        | | | 60m 0s | |



        | Subsystem | Report/Notes |
        | Patch URL | http://issues.apache.org/jira/secure/attachment/12591789/HADOOP-9655.patch |
        | Optional Tests | javadoc javac unit findbugs checkstyle |
        | git revision | trunk / 6ae2a0d |
        | checkstyle | https://builds.apache.org/job/PreCommit-HADOOP-Build/6400/artifact/patchprocess/diffcheckstylehadoop-common.txt |
        | hadoop-common test log | https://builds.apache.org/job/PreCommit-HADOOP-Build/6400/artifact/patchprocess/testrun_hadoop-common.txt |
        | Test Results | https://builds.apache.org/job/PreCommit-HADOOP-Build/6400/testReport/ |
        | Java | 1.7.0_55 |
        | uname | Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
        | Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/6400/console |

        This message was automatically generated.


          People

          • Assignee: Unassigned
          • Reporter: Nemon Lou
          • Votes: 0
          • Watchers: 9
