[HADOOP-7472] RPC client should deal with the IP address changes - ASF JIRA

Details

Type: Improvement
Status: Closed
Priority: Minor
Resolution: Fixed
Affects Version/s: 0.20.205.0
Fix Version/s: 0.20.205.0, 0.23.0
Component/s: ipc
Labels: None
Hadoop Flags: Reviewed

Description

The current RPC client implementation and the client-side callers assume that the hostname-address mappings of servers never change. The resolved address is stored in an immutable InetSocketAddress object above/outside RPC, and the reconnect logic in the RPC Connection implementation also trusts the resolved address that was passed down.

If the NameNode (NN) suffers a failure that requires migration, it may be restarted on a different node with a different IP address. In this case, even if the name-to-address mapping is updated in DNS, the cluster remains stuck trying the old address until the whole cluster is restarted.

The RPC client side should detect this situation and either exit or try to recover.
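
As a rough illustration of what detecting this situation could look like, the sketch below re-resolves the hostname behind a cached `InetSocketAddress` and compares the result with the cached IP. The class and method names (`AddressChangeCheck`, `reresolve`, `addressChanged`) are hypothetical and are not taken from the attached patches; only standard `java.net` calls are used.

```java
import java.net.InetSocketAddress;

// Hypothetical helper, not part of Hadoop: shows one way a client could
// notice that a server's hostname now maps to a different IP address.
class AddressChangeCheck {

    // Builds a new InetSocketAddress from the cached host string and port.
    // The constructor performs a fresh name lookup (subject to JVM DNS caching);
    // if the lookup fails, the returned address is marked unresolved.
    static InetSocketAddress reresolve(InetSocketAddress cached) {
        return new InetSocketAddress(cached.getHostString(), cached.getPort());
    }

    // Returns true only when the fresh lookup succeeded and yielded an IP
    // different from the cached one. A failed lookup is treated as
    // "no evidence of change" so that a DNS outage does not trigger a switch.
    static boolean addressChanged(InetSocketAddress cached, InetSocketAddress fresh) {
        if (fresh.isUnresolved()) {
            return false;
        }
        return !fresh.getAddress().equals(cached.getAddress());
    }
}
```

A reconnect path could call `reresolve` before each retry and, when `addressChanged` returns true, either switch to the fresh address or report the change to the caller. Note that the JVM may cache successful lookups, so a DNS update is not necessarily visible immediately.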

Updating ConnectionId within the Client implementation may get the system working for the moment, but there is always a risk that the cached address:port becomes connectable again unintentionally. The real solution is to notify the upper layers of the address change so that they can re-resolve and retry, or to re-architect the system as discussed in HDFS-34.

For the 0.20 lines, some type of compromise may be acceptable: for example, raising a custom exception so that some well-defined, high-impact upper layers can re-resolve and retry, while the others will have to restart. For TRUNK, the HA work will most likely determine what needs to be done, so this JIRA won't cover solutions for TRUNK.
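
The custom-exception compromise described above might be sketched as follows. This is only an illustration under stated assumptions: `AddressChangedException`, `RpcCall`, and `callWithReResolve` are hypothetical names invented here, not classes from Hadoop or from the attached patches.

```java
import java.io.IOException;
import java.net.InetSocketAddress;

// Hypothetical sketch: the RPC layer signals an address change with a
// dedicated exception, and an upper layer that knows how to recover retries
// against the new address. Callers that do not catch it simply fail.
class ReResolveRetry {

    // Hypothetical exception carrying the newly resolved address.
    static class AddressChangedException extends IOException {
        final InetSocketAddress newAddress;

        AddressChangedException(InetSocketAddress oldAddr, InetSocketAddress newAddr) {
            super("Server address changed from " + oldAddr + " to " + newAddr);
            this.newAddress = newAddr;
        }
    }

    // Hypothetical stand-in for a single RPC invocation against a server.
    interface RpcCall<T> {
        T invoke(InetSocketAddress server) throws IOException;
    }

    // Invokes the call, switching to the new address each time an
    // AddressChangedException is raised, up to maxRetries switches.
    static <T> T callWithReResolve(InetSocketAddress server, RpcCall<T> call, int maxRetries)
            throws IOException {
        InetSocketAddress target = server;
        for (int attempt = 0; ; attempt++) {
            try {
                return call.invoke(target);
            } catch (AddressChangedException e) {
                if (attempt >= maxRetries) {
                    throw e; // give up and let the caller decide (e.g. restart)
                }
                target = e.newAddress;
            }
        }
    }
}
```

Bounding the number of switches keeps a flapping DNS entry from causing an endless retry loop; other `IOException`s propagate unchanged.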

Attachments


- addr_change_dfs_0_20s.patch.txt, 03/Aug/11 13:27, 2 kB, Kihwal Lee
- addr_change_dfs_0_20s-1.patch.txt, 03/Aug/11 21:03, 2 kB, Kihwal Lee
- addr_change_dfs_0_20s-2.patch.txt, 08/Aug/11 20:56, 2 kB, Kihwal Lee
- addr_change_dfs_trunk.patch.txt, 02/Aug/11 22:47, 3 kB, Kihwal Lee
- addr_change_dfs_trunk-1.patch.txt, 03/Aug/11 13:27, 3 kB, Kihwal Lee
- addr_change_dfs_trunk-2.patch.txt, 03/Aug/11 21:03, 2 kB, Kihwal Lee
- addr_change_dfs_trunk-3.patch.txt, 08/Aug/11 20:56, 2 kB, Kihwal Lee
- addr_change_dfs.patch.txt, 22/Jul/11 22:07, 20 kB, Kihwal Lee
- addr_change_dfs-1.patch.txt, 25/Jul/11 16:04, 18 kB, Kihwal Lee
- addr_change_dfs-2.patch.txt, 01/Aug/11 21:49, 3 kB, Kihwal Lee
- addr_change_dfs-3.patch.txt, 02/Aug/11 19:22, 2 kB, Kihwal Lee

Issue Links

is related to

HDFS-34 The elephant should remember names, not numbers. (Resolved)

relates to

HADOOP-7510 Tokens should use original hostname provided instead of ip (Closed)

HADOOP-7492 Add automated test for the RPC IP addr change detection and reconnect feature (Open)

People

Assignee: Kihwal Lee

Reporter: Kihwal Lee

Votes: 0

Watchers: 10

Dates

Created: 18/Jul/11 18:56

Updated: 19/Oct/11 00:26

Resolved: 10/Aug/11 20:27