Details
- Type: Sub-task
- Status: Open
- Priority: Major
- Resolution: Unresolved
- 3.0.0-alpha1
Description
In rare instances, the NameNode fails to load the editlog during startup because the editlog is corrupt. The impact is more severe when the corruption is in the editlog segment to be checkpointed, because checkpointing fails if the corrupt editlog cannot be consumed. If an administrator does not notice this and address it by saving the namespace, recovering the namespace involves complex file editing, restoring from previous backups, or losing the last set of modifications.
The other issue, which happens frequently, is that checkpointing fails and does not happen for a long time, resulting in long editlogs and even corrupt editlogs.
To handle these issues, when the NameNode is stopped, we can put it in safemode and save the namespace before the process shuts down. As an added benefit, the NameNode restart would be faster, since there would be no editlog to consume.
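The proposed shutdown sequence could be sketched roughly as below. This is only an illustration, not NameNode code: `NamespaceOps`, `RecordingOps`, and `saveBeforeShutdown` are hypothetical names that stand in for whatever internal calls would be used. The manual equivalent available to administrators today is `hdfs dfsadmin -safemode enter` followed by `hdfs dfsadmin -saveNamespace`.

```java
import java.util.ArrayList;
import java.util.List;

class SaveOnShutdownSketch {

    /** Hypothetical stand-in for the two NameNode operations the proposal needs. */
    interface NamespaceOps {
        void enterSafeMode() throws Exception;
        void saveNamespace() throws Exception;
    }

    /**
     * Enters safemode, then saves the namespace.
     * Returns true if the namespace was saved. Failures are logged and
     * swallowed so that a failed save does not prevent the process from exiting.
     */
    static boolean saveBeforeShutdown(NamespaceOps ops) {
        try {
            ops.enterSafeMode();   // block further namespace modifications first
            ops.saveNamespace();   // write a fresh image so no editlog is left to replay
            return true;
        } catch (Exception e) {
            System.err.println("Saving namespace on shutdown failed: " + e);
            return false;
        }
    }

    /** In-memory fake that records the order of calls, for illustration only. */
    static class RecordingOps implements NamespaceOps {
        final List<String> calls = new ArrayList<>();
        @Override public void enterSafeMode() { calls.add("enterSafeMode"); }
        @Override public void saveNamespace() { calls.add("saveNamespace"); }
    }

    public static void main(String[] args) {
        RecordingOps ops = new RecordingOps();
        // Run the save sequence when the JVM begins an orderly shutdown.
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            boolean saved = saveBeforeShutdown(ops);
            System.out.println("saved=" + saved + " calls=" + ops.calls);
        }));
    }
}
```

One design question this sketch leaves open is how long a shutdown may be delayed: saving a large namespace can take a while, so a real implementation would probably need a timeout or an opt-out.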
What do folks think?
Attachments
Issue Links
- is related to
  - HDFS-3651 optionally, the NameNode should invoke saveNamespace after getting a SIGTERM (Open)