Details
Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
None
Description
In an HA deployment, clients are configured with the hostnames of both the Active and Standby Namenodes. A client first tries one of the NNs (chosen non-deterministically); if it is the Standby NN, it responds by telling the client to retry the request on the other Namenode.
If the client happens to talk to the Standby first, and the Standby is in a GC pause or otherwise busy, the client might not get a response soon enough to try the other NN.
Proposed approach to solve this:
1) Use hedged RPCs to call all configured NNs simultaneously and determine which one is the active Namenode.
2) Subsequent calls will invoke the previously successful NN.
3) On failover of the currently active NN, the remaining NNs will be invoked to determine which one is the new active.
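The three steps above can be sketched roughly as follows. This is a minimal illustration, not the code committed for this issue: `HedgedCallSketch`, `Namenode` and `FakeNamenode` are hypothetical names, a standby is assumed to signal itself by throwing an exception, and hedging is done with the standard `ExecutorService.invokeAny`, which returns the result of one task that completed without throwing.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Illustrative only: none of these types are Hadoop classes. */
class HedgedCallSketch {

    /** Hypothetical stand-in for an RPC proxy to one Namenode. */
    interface Namenode {
        String name();
        String call(String request) throws Exception; // assumed to throw when standby
    }

    /** In-memory fake used to exercise the sketch. */
    static class FakeNamenode implements Namenode {
        private final String name;
        volatile boolean active;

        FakeNamenode(String name, boolean active) {
            this.name = name;
            this.active = active;
        }

        public String name() { return name; }

        public String call(String request) {
            if (!active) {
                throw new IllegalStateException(name + " is standby");
            }
            return name + " handled " + request;
        }
    }

    private final List<Namenode> namenodes;
    private volatile Namenode lastSuccessful; // step 2: remembered between calls

    HedgedCallSketch(List<Namenode> namenodes) { this.namenodes = namenodes; }

    Namenode lastSuccessful() { return lastSuccessful; }

    String invoke(String request) throws Exception {
        Namenode cached = lastSuccessful;
        if (cached != null) {
            try {
                return cached.call(request); // step 2: reuse the known-good NN
            } catch (Exception e) {
                lastSuccessful = null; // step 3: treat the failure as a failover
            }
        }
        // Step 1 (or step 3 after a failure): hedge across the remaining NNs.
        List<Callable<Map.Entry<Namenode, String>>> calls = new ArrayList<>();
        for (Namenode nn : namenodes) {
            if (nn != cached) {
                calls.add(() -> new SimpleImmutableEntry<>(nn, nn.call(request)));
            }
        }
        if (calls.isEmpty()) {
            throw new IllegalStateException("no Namenodes left to try");
        }
        ExecutorService pool = Executors.newFixedThreadPool(calls.size());
        try {
            // Blocks until one call succeeds; throws ExecutionException if all fail.
            Map.Entry<Namenode, String> winner = pool.invokeAny(calls);
            lastSuccessful = winner.getKey();
            return winner.getValue();
        } finally {
            pool.shutdownNow(); // abandon the slower or failing calls
        }
    }
}
```

With two fakes where only `nn2` is active, the first `invoke` hedges across both and remembers `nn2`; after flipping the `active` flags, the next `invoke` fails on `nn2` and falls back to the remaining NN. Creating a thread pool per hedged call keeps the sketch short; a real client would more likely share an executor.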
Attachments
Issue Links
- is related to
  - HADOOP-11824 HDFS client and YARN should check for elected master with Zookeeper (Open)
  - HDFS-9797 Log Standby exceptions thrown by RequestHedgingProxyProvider at DEBUG Level (Resolved)
- relates to
  - HDFS-13388 RequestHedgingProxyProvider calls multiple configured NNs all the time (Resolved)
  - HDFS-10329 Bad initialisation of StringBuffer in RequestHedgingProxyProvider.java (Resolved)
  - HDFS-9836 RequestHedgingInvocationHandler can't be cast to org.apache.hadoop.ipc.RpcInvocationHandler (Resolved)