Hadoop Common / HADOOP-7472

RPC client should deal with the IP address changes


Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.20.205.0
    • Fix Version/s: 0.20.205.0, 0.23.0
    • Component/s: ipc
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      The current RPC client implementation and the client-side callers assume that the hostname-address mappings of servers never change. The resolved address is stored in an immutable InetSocketAddress object above/outside RPC, and the reconnect logic in the RPC Connection implementation also trusts the resolved address that was passed down.

      If the NameNode (NN) suffers a failure that requires migration, it may be started on a different node with a different IP address. In this case, even if the name-address mapping is updated in DNS, the cluster keeps trying the old address until the whole cluster is restarted.

      The RPC client side should detect this situation and either exit or try to recover.
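
      A minimal sketch of how such detection could look, assuming the client still knows the hostname behind the cached InetSocketAddress. The class AddressChangeCheck, the Resolver hook, and the method reresolveIfChanged are hypothetical names used for illustration; they are not taken from the attached patches.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

// Hypothetical helper, not part of Hadoop: detects a changed hostname-address mapping.
final class AddressChangeCheck {

    /** Hypothetical lookup hook so the check can be exercised without real DNS. */
    interface Resolver {
        InetAddress resolve(String host) throws UnknownHostException;
    }

    /**
     * Re-resolves the host string of the cached address.
     * Returns a new InetSocketAddress if the host now maps to a different IP,
     * or null if the mapping is unchanged or cannot be resolved right now.
     */
    static InetSocketAddress reresolveIfChanged(InetSocketAddress cached, Resolver resolver) {
        try {
            // getHostString() returns the stored name or literal without a reverse lookup.
            InetAddress current = resolver.resolve(cached.getHostString());
            // An unresolved cached address has getAddress() == null, which never equals current.
            if (!current.equals(cached.getAddress())) {
                return new InetSocketAddress(current, cached.getPort());
            }
        } catch (UnknownHostException e) {
            // Resolution failed; the caller keeps using the cached address.
        }
        return null;
    }
}
```

      In real use the resolver would typically be `InetAddress::getByName`, and the check would run when a connection attempt fails rather than on every call.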

      Updating the ConnectionId within the Client implementation may get the system working for the moment, but there is always a risk that the cached address:port unintentionally becomes connectable again. The real solution is to notify the upper layers of the address change so that they can re-resolve and retry, or to re-architect the system as discussed in HDFS-34.

      For the 0.20 line, some type of compromise may be acceptable. For example, the client could raise a custom exception so that a few well-defined, high-impact upper layers can re-resolve and retry, while the others will have to restart. For TRUNK, the HA work will most likely determine what needs to be done, so this Jira won't cover solutions for TRUNK.
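
      The custom-exception compromise could be sketched roughly as follows. AddressChangedException, ReresolvingCaller, and the Call interface are hypothetical names chosen for illustration, and the retry policy is an assumption; none of this is taken from the attached patches.

```java
import java.io.IOException;
import java.net.InetSocketAddress;

// Hypothetical exception raised by the RPC layer when it notices a new server address.
class AddressChangedException extends IOException {
    private final InetSocketAddress newAddress;

    AddressChangedException(InetSocketAddress oldAddress, InetSocketAddress newAddress) {
        super("Server address changed from " + oldAddress + " to " + newAddress);
        this.newAddress = newAddress;
    }

    InetSocketAddress getNewAddress() {
        return newAddress;
    }
}

// Hypothetical upper-layer wrapper that switches to the new address and retries.
final class ReresolvingCaller {

    /** Hypothetical stand-in for one RPC invocation against a given server address. */
    interface Call<T> {
        T invoke(InetSocketAddress server) throws IOException;
    }

    /** Invokes the call, following at most maxSwitches address changes before giving up. */
    static <T> T callWithReresolve(InetSocketAddress server, int maxSwitches, Call<T> call)
            throws IOException {
        InetSocketAddress target = server;
        for (int switches = 0; ; switches++) {
            try {
                return call.invoke(target);
            } catch (AddressChangedException e) {
                if (switches >= maxSwitches) {
                    throw e; // Too many changes; surface the problem to the caller.
                }
                target = e.getNewAddress(); // Retry against the re-resolved address.
            }
        }
    }
}
```

      Upper layers that do not catch the exception see an ordinary IOException, which matches the idea that only selected callers recover while the others restart.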

      Attachments

        1. addr_change_dfs_0_20s.patch.txt
          2 kB
          Kihwal Lee
        2. addr_change_dfs_0_20s-1.patch.txt
          2 kB
          Kihwal Lee
        3. addr_change_dfs_0_20s-2.patch.txt
          2 kB
          Kihwal Lee
        4. addr_change_dfs_trunk.patch.txt
          3 kB
          Kihwal Lee
        5. addr_change_dfs_trunk-1.patch.txt
          3 kB
          Kihwal Lee
        6. addr_change_dfs_trunk-2.patch.txt
          2 kB
          Kihwal Lee
        7. addr_change_dfs_trunk-3.patch.txt
          2 kB
          Kihwal Lee
        8. addr_change_dfs.patch.txt
          20 kB
          Kihwal Lee
        9. addr_change_dfs-1.patch.txt
          18 kB
          Kihwal Lee
        10. addr_change_dfs-2.patch.txt
          3 kB
          Kihwal Lee
        11. addr_change_dfs-3.patch.txt
          2 kB
          Kihwal Lee


          People

            Assignee: Kihwal Lee
            Reporter: Kihwal Lee
            Votes: 0
            Watchers: 11
