Affects Version/s: 0.18.2
Fix Version/s: None
After restarting a cluster (including rebooting the machines), the DFS appeared corrupted because many DataNodes did not start up, each failing with the following exception:
2009-02-26 22:33:53,774 ERROR org.apache.hadoop.dfs.DataNode: org.apache.hadoop.dfs.InconsistentFSStateException: Directory xxx is in an inconsistent state: version file in current directory is missing.
This happens when a DataNode uses multiple disks and at least one of them was previously mounted read-only, so its storage version became out-dated. After the reboot that disk was mounted read-write again, and the DataNode refused to start because of the out-dated version.
This is a big headache. If a DataNode has multiple disks and at least one of them has the correct storage version, out-dated versions on the other disks should not bring down the whole DataNode.
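A minimal sketch of the behavior being requested, not the actual Hadoop code: instead of aborting on the first storage directory whose version is inconsistent, the DataNode could collect the usable directories, skip the bad ones, and fail only when no consistent directory remains. The class and method names here are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: tolerate out-dated storage directories as long as
// at least one directory passes the version consistency check.
public class StorageDirCheck {
    // dirs and hasValidVersion are parallel lists; returns the subset of
    // directories that pass the check.
    static List<String> usableDirs(List<String> dirs, List<Boolean> hasValidVersion) {
        List<String> usable = new ArrayList<>();
        for (int i = 0; i < dirs.size(); i++) {
            if (hasValidVersion.get(i)) {
                usable.add(dirs.get(i));   // consistent directory: keep it
            }
            // inconsistent directory: log and skip instead of throwing
        }
        if (usable.isEmpty()) {
            // only now is it fatal: no disk has a usable storage version
            throw new IllegalStateException("no consistent storage directory found");
        }
        return usable;
    }
}
```

Under this policy the scenario above (one read-only disk with a stale version, others fine) would let the DataNode come up on the healthy disks rather than staying down.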
|Field||Original Value||New Value|
|Summary||DataNodes do not start up when a previous version has not been cleaned up||DataNodes do not start up because InconsistentFSStateException on just part of the disks in use|
|Assignee||Hairong Kuang [ hairong ]|
|Priority||Blocker [ 1 ]||Critical [ 2 ]|