Hadoop Common / HADOOP-4659

Root cause of connection failure is being lost to code that uses it for delaying startup


Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.18.3
    • Fix Version/s: 0.18.3
    • Component/s: ipc
    • Labels: None
    • Hadoop Flags: Incompatible change, Reviewed

    Description

      In ipc.Client, the root cause of a connection failure is being lost because the exception is wrapped, so the outside code that looks for that root cause isn't working as expected. The result is that you can't bring up a task tracker before a job tracker, and probably the same applies to a datanode before a namenode. The change that triggered this has not yet been located; I had thought it was HADOOP-3844, but I no longer believe this is the case.
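      To illustrate the failure mode described above, here is a minimal sketch, not the actual ipc.Client code or the attached patches. It assumes the client rethrows a ConnectException wrapped in a plain IOException; the class RootCauseDemo, its methods, and the address "localhost:9001" are hypothetical names used only for this example. A caller that tests only the outer exception type no longer sees the connection failure, while one that walks the cause chain still does.

```java
import java.io.IOException;
import java.net.ConnectException;

// Hypothetical demo class; none of these names come from Hadoop itself.
class RootCauseDemo {

    // Simulates a client that wraps the low-level failure in a generic
    // IOException, keeping the original exception only as the cause.
    static void connect(String address) throws IOException {
        try {
            throw new ConnectException("Connection refused");
        } catch (ConnectException e) {
            throw new IOException("Call to " + address + " failed", e);
        }
    }

    // Walks the cause chain and returns the first exception of the
    // requested type, or null if there is none.
    static <T extends Throwable> T findCause(Throwable t, Class<T> type) {
        int depth = 0; // guard against cyclic cause chains
        while (t != null && depth++ < 32) {
            if (type.isInstance(t)) {
                return type.cast(t);
            }
            t = t.getCause();
        }
        return null;
    }

    // Checks only the outer type, so a wrapped failure looks fatal.
    static boolean shouldRetryNaive(IOException e) {
        return e instanceof ConnectException;
    }

    // Inspects the whole chain, so "server not up yet" is still recognised.
    static boolean shouldRetry(IOException e) {
        return findCause(e, ConnectException.class) != null;
    }

    public static void main(String[] args) {
        try {
            connect("localhost:9001"); // placeholder address
        } catch (IOException e) {
            System.out.println("naive check retries: " + shouldRetryNaive(e));
            System.out.println("chain check retries: " + shouldRetry(e));
        }
    }
}
```

      Running main prints "naive check retries: false" followed by "chain check retries: true". Whether the right fix is to stop wrapping in the client or to unwrap in the caller is a design choice this sketch does not settle.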

      Attachments

        1. hadoop-4659.patch (1 kB, Steve Loughran)
        2. connectRetry.patch (0.7 kB, Hairong Kuang)
        3. rpcConn.patch (3 kB, Hairong Kuang)
        4. hadoop-4659.patch (6 kB, Steve Loughran)
        5. rpcConn1.patch (6 kB, Hairong Kuang)

          People

            Assignee: Steve Loughran (stevel@apache.org)
            Reporter: Steve Loughran (stevel@apache.org)
            Votes: 0
            Watchers: 2
