Hadoop Common / HADOOP-5342

DataNodes do not start up because InconsistentFSStateException on just part of the disks in use

    Details

    • Type: Bug
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: 0.18.2
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

      After restarting a cluster (including a reboot), the DFS became corrupted because many DataNodes failed to start, hitting the following exception:

      2009-02-26 22:33:53,774 ERROR org.apache.hadoop.dfs.DataNode: org.apache.hadoop.dfs.InconsistentFSStateException: Directory xxx is in an inconsistent state: version file in current directory is missing.
      at org.apache.hadoop.dfs.Storage$StorageDirectory.analyzeStorage(Storage.java:326)
      at org.apache.hadoop.dfs.DataStorage.recoverTransitionRead(DataStorage.java:105)
      at org.apache.hadoop.dfs.DataNode.startDataNode(DataNode.java:306)
      at org.apache.hadoop.dfs.DataNode.<init>(DataNode.java:223)
      at org.apache.hadoop.dfs.DataNode.makeInstance(DataNode.java:3030)
      at org.apache.hadoop.dfs.DataNode.instantiateDataNode(DataNode.java:2985)
      at org.apache.hadoop.dfs.DataNode.createDataNode(DataNode.java:2993)
      at org.apache.hadoop.dfs.DataNode.main(DataNode.java:3115)

      This happens on a DataNode with multiple disks when at least one disk was previously marked as read-only, so that its storage version became outdated. After the reboot that disk was mounted read-write again, and the DataNode refused to start because of the outdated version.

      This is a big headache. If a DataNode has multiple disks and at least one of them has the correct storage version, outdated versions on the other disks should not bring down the DataNode.
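
      The requested behavior can be sketched as follows. This is a minimal illustration, not Hadoop code: the class TolerantStorageScan and its methods are hypothetical names, and the check is simplified to "does current/VERSION exist", whereas the real analyzeStorage logic distinguishes more states and the report also concerns outdated version contents. The idea shown is only that one bad directory is skipped rather than aborting startup, and that startup fails only when no directory is usable.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of Hadoop: per-directory tolerance at startup. */
public class TolerantStorageScan {

    /** Simplified check (assumption): true if dataDir/current/VERSION is a regular file. */
    static boolean hasVersionFile(File dataDir) {
        return new File(new File(dataDir, "current"), "VERSION").isFile();
    }

    /**
     * Returns the directories that pass the check, skipping the others.
     * Throws only when none of the configured directories is usable.
     */
    static List<File> usableDirs(List<File> dataDirs) throws IOException {
        List<File> usable = new ArrayList<>();
        for (File dir : dataDirs) {
            if (hasVersionFile(dir)) {
                usable.add(dir);
            } else {
                // Log and continue instead of failing the whole DataNode.
                System.err.println("Skipping inconsistent storage directory: " + dir);
            }
        }
        if (usable.isEmpty()) {
            throw new IOException("No usable storage directory among " + dataDirs);
        }
        return usable;
    }
}
```

      Failing only when the usable list is empty keeps the existing hard failure for a node with no valid storage at all, while letting a node with one stale disk keep serving the blocks on its healthy disks.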

        Activity

        Sameer Paranjpye made changes -
        Field | Original Value | New Value
        Assignee | (empty) | Hairong Kuang [ hairong ]
        Priority | Blocker [ 1 ] | Critical [ 2 ]

        Christian Kunz made changes -
        Field | Original Value | New Value
        Summary | DataNodes do not start up when a previous version has not been cleaned up | DataNodes do not start up because InconsistentFSStateException on just part of the disks in use

        Christian Kunz created issue -

          People

          • Assignee: Hairong Kuang
          • Reporter: Christian Kunz
          • Votes: 2
          • Watchers: 6
