Details
New Feature
Status: Closed
Major
Resolution: Fixed
0.17.2
None
None
Incompatible change, Reviewed
Modified dfsadmin -report to report under-replicated blocks, blocks with corrupt replicas, and missing blocks.
Description
A large number of datanodes were marked dead because network problems caused heartbeat timeouts, although the datanodes themselves were fine.
Many processes started to fail because of the corrupted filesystem.
To catch and diagnose such problems faster, the namenode should detect the corruption automatically and provide a way to alert operations. At a minimum, it should show the fact of corruption on the GUI.
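As a rough illustration of how operations might consume the block counts that dfsadmin -report now prints, here is a minimal Python sketch. It assumes the report contains lines of the form "Under replicated blocks: N", "Blocks with corrupt replicas: N", and "Missing blocks: N"; the exact label text may differ between Hadoop versions. The names parse_block_health and needs_alert are hypothetical helpers, not part of Hadoop.

```python
import re

# Assumed label text for each counter; adjust to match the report
# output of the Hadoop version actually in use.
COUNTERS = {
    "under_replicated": "Under replicated blocks",
    "corrupt_replicas": "Blocks with corrupt replicas",
    "missing": "Missing blocks",
}


def parse_block_health(report_text):
    """Extract block-health counters from the text of a dfsadmin report.

    Returns a dict containing only the counters found in the text.
    """
    counts = {}
    for key, label in COUNTERS.items():
        pattern = r"^\s*" + re.escape(label) + r"\s*:\s*(\d+)"
        match = re.search(pattern, report_text, re.MULTILINE)
        if match:
            counts[key] = int(match.group(1))
    return counts


def needs_alert(counts):
    """Hypothetical policy: alert when any block is corrupt or missing."""
    return counts.get("corrupt_replicas", 0) > 0 or counts.get("missing", 0) > 0
```

The report text itself would come from running the dfsadmin -report command and capturing its standard output. Under-replicated blocks are parsed but deliberately left out of the alert condition here, since they can occur transiently; that threshold is a policy choice rather than anything the ticket prescribes.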