Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.0.0-alpha
    • Fix Version/s: 2.8.0, 2.9.0, 2.7.4, 3.0.0-alpha2
    • Component/s: ipc
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      Reader threads can die due to a race condition with the responder thread. If the server's IPC handler cannot send a response in one write, it delegates sending the rest of the response to the responder thread.

      The race occurs when the responder thread hits an exception while writing to the socket. The responder closes the socket, which wakes up the reader polling on that socket. If a CancelledKeyException (a runtime exception) is thrown, the reader dies. All connections serviced by that reader are then left in limbo until the client times out, if it ever does. New connections play roulette as to whether they are assigned to a defunct reader.
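
      The failure mode described above can be reproduced in isolation with plain java.nio. The following is a minimal sketch, not the code in Server.java: the class and method names are hypothetical, and a Pipe stands in for the client socket so that no network is needed. It shows that once another thread has closed a registered channel, SelectionKey.isReadable() throws CancelledKeyException, and that a guarded check lets a reader skip the key instead of dying.

```java
import java.io.IOException;
import java.nio.channels.CancelledKeyException;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

// Hypothetical illustration class; none of these names come from Hadoop.
public class ReaderLoopSketch {

    /** Unguarded check: a cancelled key makes this throw, which would end a reader's run loop. */
    static boolean unguardedIsReadable(SelectionKey key) {
        return key.isReadable();
    }

    /** Guarded check: a key cancelled by another thread is treated as "nothing to read". */
    static boolean guardedIsReadable(SelectionKey key) {
        try {
            return key.isValid() && key.isReadable();
        } catch (CancelledKeyException e) {
            // The key can still be cancelled between isValid() and isReadable(),
            // so the catch is needed even with the validity check.
            return false;
        }
    }

    /** Registers a pipe's read end, then closes it the way a responder thread might close a socket. */
    static SelectionKey keyClosedByOtherThread(Selector selector) throws IOException {
        Pipe pipe = Pipe.open();
        pipe.source().configureBlocking(false);
        SelectionKey key = pipe.source().register(selector, SelectionKey.OP_READ);
        pipe.source().close(); // closing the channel cancels its keys
        pipe.sink().close();
        return key;
    }

    public static void main(String[] args) throws IOException {
        try (Selector selector = Selector.open()) {
            SelectionKey key = keyClosedByOtherThread(selector);
            System.out.println("guarded check returned " + guardedIsReadable(key));
            try {
                unguardedIsReadable(key);
            } catch (CancelledKeyException e) {
                System.out.println("unguarded check threw CancelledKeyException");
            }
        }
    }
}
```

      In a real reader the guarded branch would simply move on to the next selected key and leave cleanup to whichever thread closed the connection.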

          Activity

          vinodkv Vinod Kumar Vavilapalli added a comment -

          Removing the target version from this long-standing issue. Please add it back once a patch is available for release. Thanks.

          daryn Daryn Sharp added a comment -

          This is the patch we've been using internally, modified per HADOOP-13657 to terminate the process if the reader encounters an unrecoverable runtime exception (e.g., a JDK bug). No test is included because the failure mode is difficult to instrument.
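
          The behaviour described in this comment could be sketched roughly as follows. This is an editorial illustration and not the attached patch: the class name, the step list, and the exitHook parameter are all hypothetical stand-ins. The idea is that a cancelled key is treated as recoverable and skipped, while any other unexpected Throwable triggers process termination rather than leaving a dead reader behind a server that still accepts connections.

```java
import java.nio.channels.CancelledKeyException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntConsumer;

// Hypothetical illustration class; none of these names come from Hadoop.
public class FailFastReaderSketch {

    /**
     * Runs each step in order, standing in for iterations of a reader's select loop.
     * Returns the number of steps that completed normally.
     * exitHook is an assumed injection point; a real server would terminate the JVM there.
     */
    static int runLoop(List<Runnable> steps, IntConsumer exitHook) {
        int completed = 0;
        for (Runnable step : steps) {
            try {
                step.run();
                completed++;
            } catch (CancelledKeyException e) {
                // Recoverable: another thread closed the connection, so move on.
            } catch (Throwable t) {
                // Unrecoverable: stop the loop and ask for process termination.
                exitHook.accept(1);
                return completed;
            }
        }
        return completed;
    }

    public static void main(String[] args) {
        List<Runnable> steps = new ArrayList<>();
        steps.add(() -> { });
        steps.add(() -> { throw new CancelledKeyException(); });
        steps.add(() -> { });
        steps.add(() -> { throw new IllegalStateException("simulated selector bug"); });
        steps.add(() -> { });
        int done = runLoop(steps, status -> System.out.println("would exit with status " + status));
        System.out.println("completed steps: " + done);
    }
}
```

          Passing the exit action in as a parameter is only a convenience for the sketch, so that the fail-fast path can be exercised without actually stopping the JVM.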

          hadoopqa Hadoop QA added a comment -
          -1 overall



          Vote  Subsystem   Runtime  Comment
          0     reexec      0m 17s   Docker mode activated.
          +1    @author     0m 0s    The patch does not contain any @author tags.
          -1    test4tests  0m 0s    The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
          +1    mvninstall  6m 52s   trunk passed
          +1    compile     7m 2s    trunk passed
          +1    checkstyle  0m 26s   trunk passed
          +1    mvnsite     1m 1s    trunk passed
          +1    mvneclipse  0m 11s   trunk passed
          +1    findbugs    1m 22s   trunk passed
          +1    javadoc     0m 46s   trunk passed
          +1    mvninstall  0m 39s   the patch passed
          +1    compile     7m 13s   the patch passed
          +1    javac       7m 13s   the patch passed
          +1    checkstyle  0m 30s   hadoop-common-project/hadoop-common: The patch generated 0 new + 195 unchanged - 1 fixed = 195 total (was 196)
          +1    mvnsite     0m 55s   the patch passed
          +1    mvneclipse  0m 12s   the patch passed
          +1    whitespace  0m 0s    The patch has no whitespace issues.
          +1    findbugs    1m 28s   the patch passed
          +1    javadoc     0m 45s   the patch passed
          +1    unit        8m 14s   hadoop-common in the patch passed.
          +1    asflicense  0m 21s   The patch does not generate ASF License warnings.
                            39m 43s



          Subsystem       Report/Notes
          Docker          Image:yetus/hadoop:9560f25
          JIRA Issue      HADOOP-11780
          JIRA Patch URL  https://issues.apache.org/jira/secure/attachment/12830505/HADOOP-11780.patch
          Optional Tests  asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
          uname           Linux 71f69fd0563c 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
          Build tool      maven
          Personality     /testptch/hadoop/patchprocess/precommit/personality/provided.sh
          git revision    trunk / ebf528c
          Default Java    1.8.0_101
          findbugs        v3.0.0
          Test Results    https://builds.apache.org/job/PreCommit-HADOOP-Build/10610/testReport/
          modules         C: hadoop-common-project/hadoop-common U: hadoop-common-project/hadoop-common
          Console output  https://builds.apache.org/job/PreCommit-HADOOP-Build/10610/console
          Powered by      Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org

          This message was automatically generated.

          kihwal Kihwal Lee added a comment -

          The patch looks good. We have been running with this for a long time. The only difference is the addition of JVM termination, which was done to take care of the other cases reported in HADOOP-13657. Zhe Zhang, are you okay with this patch?

          shv Konstantin Shvachko added a comment -

          The patch looks good. Also fixes HADOOP-13657.
          +1 on behalf of Zhe Zhang

          hudson Hudson added a comment -

          SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10507 (See https://builds.apache.org/job/Hadoop-trunk-Commit/10507/)
          HADOOP-11780. Prevent IPC reader thread death. Contributed by Daryn (kihwal: rev e19b37ead23805c7ed45bdcbfa7fdc8898cde7b2)

          • (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Server.java
          kihwal Kihwal Lee added a comment -

          Thanks for the review, Konstantin Shvachko. Committed this to trunk through branch-2.8. The patch needs a minor tweak for branch-2.7.

          kihwal Kihwal Lee added a comment -

          The 2.7 commit has been done. The only change is using type Call instead of RpcCall in one line, because support for multiple call types is missing in 2.7. Everything else is identical.

          djp Junping Du added a comment -

          Adding back 2.8.0 and 2.9.0 in fix version.


            People

            • Assignee: daryn Daryn Sharp
            • Reporter: daryn Daryn Sharp
            • Votes: 0
            • Watchers: 20
