Details
- Type: Improvement
- Status: Resolved
- Priority: Major
- Resolution: Duplicate
Description
When a namenode restarts, it receives heartbeats followed by block reports from the datanodes. Block report processing is heavyweight and can take many seconds, especially when each datanode holds a large number of blocks. As a result, the namenode cannot process heartbeats in time, which leads to "lost heartbeats from datanode". The namenode then declares the datanode dead, and the datanode has to register again and re-send its block report all over again. This is a scalability problem.
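The effect described above can be illustrated with a toy model. The sketch below is not Hadoop code: the class `HeartbeatStarvationSketch`, its methods, and the cost figures (5,000 ms per block report, 1 ms per heartbeat, three datanodes) are all hypothetical, chosen only to show how heartbeats queued behind block reports on a single handler are delayed, and how serving heartbeats first (the general idea behind the linked priority issues) would shorten that delay. It assumes all requests are already queued at time zero and uses a simulated clock rather than real sleeping.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class HeartbeatStarvationSketch {
    enum Kind { HEARTBEAT, BLOCK_REPORT }

    /** A queued request; seq is arrival order, costMs is an assumed processing cost. */
    static final class Request {
        final Kind kind;
        final long seq;
        final long costMs;

        Request(Kind kind, long seq, long costMs) {
            this.kind = kind;
            this.seq = seq;
            this.costMs = costMs;
        }
    }

    /**
     * Drains the queue on one simulated handler and returns the latest time (ms)
     * at which any heartbeat finished processing.
     */
    static long worstHeartbeatCompletionMs(List<Request> arrivals, boolean prioritizeHeartbeats) {
        Comparator<Request> fifo = Comparator.comparingLong(r -> r.seq);
        Comparator<Request> order = prioritizeHeartbeats
                ? Comparator.<Request>comparingInt(r -> r.kind == Kind.HEARTBEAT ? 0 : 1)
                        .thenComparing(fifo)
                : fifo;

        PriorityQueue<Request> queue = new PriorityQueue<>(order);
        queue.addAll(arrivals);

        long clockMs = 0;
        long worstMs = 0;
        while (!queue.isEmpty()) {
            Request r = queue.poll();
            clockMs += r.costMs; // simulated time spent handling this request
            if (r.kind == Kind.HEARTBEAT) {
                worstMs = Math.max(worstMs, clockMs);
            }
        }
        return worstMs;
    }

    /** Hypothetical workload: three block reports queued ahead of three heartbeats. */
    static List<Request> sampleArrivals() {
        List<Request> arrivals = new ArrayList<>();
        long seq = 0;
        for (int dn = 0; dn < 3; dn++) {
            arrivals.add(new Request(Kind.BLOCK_REPORT, seq++, 5_000));
        }
        for (int dn = 0; dn < 3; dn++) {
            arrivals.add(new Request(Kind.HEARTBEAT, seq++, 1));
        }
        return arrivals;
    }

    public static void main(String[] args) {
        List<Request> arrivals = sampleArrivals();
        System.out.println("FIFO worst heartbeat completion (ms): "
                + worstHeartbeatCompletionMs(arrivals, false));
        System.out.println("Heartbeat-first worst completion (ms): "
                + worstHeartbeatCompletionMs(arrivals, true));
    }
}
```

With these assumed costs, the last heartbeat completes at 15,003 ms under FIFO ordering and at 3 ms when heartbeats are served first. If the FIFO figure exceeds the interval after which the namenode considers a datanode dead, the datanode is dropped and must register again, which is the failure loop the description reports.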
Issue Links
- is blocked by
  - HADOOP-6952 Support sending priority RPC (Open)
- is related to
  - HDFS-599 Improve Namenode robustness by prioritizing datanode heartbeats over client requests (Closed)
- relates to
  - HDFS-779 Automatic move to safe-mode when cluster size drops (Open)