Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Fixed
- Reviewed
Description
Sometimes, due to disk or other problems, a datanode takes minutes or tens of minutes to generate a block report. While it is doing so, the datanode cannot send its heartbeat to the NameNode every 3 seconds. In the worst case, the NameNode detects a lost heartbeat and wrongly decides that the datanode is dead.
It would be nice to have two threads instead. One thread scans the data directories, generates the block report, and executes the requests sent by the NameNode. The other thread sends heartbeats and block reports and picks up the requests from the NameNode. With these two threads, the sending of heartbeats will not be delayed by a slow block report or by slow execution of NameNode requests.
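The proposed split could be sketched roughly as follows. This is a minimal illustration, not the actual DataNode code: `TwoThreadDatanodeSketch`, `NameNodeStub`, `CountingNameNode`, and `run` are hypothetical names, the slow disk scan is simulated with a sleep, and execution of NameNode requests is left out. The scanner thread hands a finished report to the sender thread through a queue, so the sender keeps sending heartbeats while the scan is still in progress.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class TwoThreadDatanodeSketch {

    /** Hypothetical stand-in for the NameNode RPC interface. */
    interface NameNodeStub {
        void heartbeat();
        void blockReport(String report);
    }

    /** Fake NameNode that only counts the calls it receives. */
    static final class CountingNameNode implements NameNodeStub {
        final AtomicInteger heartbeats = new AtomicInteger();
        final AtomicInteger blockReports = new AtomicInteger();
        @Override public void heartbeat() { heartbeats.incrementAndGet(); }
        @Override public void blockReport(String report) { blockReports.incrementAndGet(); }
    }

    /**
     * Starts a scanner thread and a sender thread, then waits for both.
     * heartbeatMs is the heartbeat interval; scanMs simulates a slow scan.
     */
    static CountingNameNode run(long heartbeatMs, long scanMs) {
        CountingNameNode nameNode = new CountingNameNode();
        BlockingQueue<String> readyReports = new LinkedBlockingQueue<>();

        // Thread 1: scans the data directories (simulated) and produces a report.
        Thread scanner = new Thread(() -> {
            try {
                Thread.sleep(scanMs); // stands in for a slow disk scan
                readyReports.put("block-report");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "scanner");

        // Thread 2: talks to the NameNode. It never scans disks itself, so a
        // slow scan cannot hold up the heartbeat.
        Thread sender = new Thread(() -> {
            try {
                while (nameNode.blockReports.get() == 0) {
                    nameNode.heartbeat();
                    // Wait at most one heartbeat interval for a finished report.
                    String report = readyReports.poll(heartbeatMs, TimeUnit.MILLISECONDS);
                    if (report != null) {
                        nameNode.blockReport(report);
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "sender");

        scanner.start();
        sender.start();
        try {
            scanner.join();
            sender.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return nameNode;
    }

    public static void main(String[] args) {
        CountingNameNode nameNode = run(20, 400);
        System.out.println("heartbeats sent during scan: " + nameNode.heartbeats.get());
        System.out.println("block reports sent: " + nameNode.blockReports.get());
    }
}
```

In this sketch the sender stops after delivering one report so that the program terminates. A real datanode would loop indefinitely and would also queue NameNode requests for the other thread to execute.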
Attachments
Issue Links
- is part of
  - HADOOP-4556 Block went missing (Closed)
- is related to
  - HDFS-770 SocketTimeoutException: timeout while waiting for channel to be ready for read (Open)
  - HDFS-2379 0.20: Allow block reports to proceed without holding FSDataset lock (Closed)
- relates to
  - HADOOP-3232 Datanodes time out (Closed)