Hadoop HDFS / HDFS-107

Data-nodes should be formatted when the name-node is formatted.

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.23.0
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      The upgrade feature (HADOOP-702) requires data-nodes to persistently store the namespaceID
      in their version files and to verify during startup that it matches the one stored on the name-node.
      When the name-node is reformatted, it generates a new namespaceID.
      If the cluster then starts with the reformatted name-node but with data-nodes that were not reformatted,
      the data-nodes fail with
      java.io.IOException: Incompatible namespaceIDs ...
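
      The startup check described above can be sketched roughly as follows. This is a minimal
      illustration, not the actual Hadoop data-node code: the class name NamespaceIdCheck, the
      method verify, and the detail appended to the exception message are all assumptions made
      for the example.

```java
import java.io.IOException;

// Hypothetical sketch of the namespaceID check; not the real Hadoop implementation.
final class NamespaceIdCheck {

    // Compares the namespaceID reported by the name-node with the one the
    // data-node persisted in its version file, and fails on a mismatch.
    static void verify(int nameNodeNamespaceId, int dataNodeNamespaceId) throws IOException {
        if (nameNodeNamespaceId != dataNodeNamespaceId) {
            // The text after the colon is illustrative only; the real message is elided in the report.
            throw new IOException("Incompatible namespaceIDs: name-node=" + nameNodeNamespaceId
                    + ", data-node=" + dataNodeNamespaceId);
        }
    }

    public static void main(String[] args) {
        try {
            verify(1001, 1001); // matching IDs: the data-node may start
            verify(2002, 1001); // name-node was reformatted: the data-node refuses to start
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

      A reformatted name-node always presents a fresh namespaceID, so under this check every
      data-node that still holds the old ID is rejected until its storage is reformatted too.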

      Data-nodes should be reformatted whenever the name-node is. I see two approaches here:
      1) To reformat the cluster, we call "start-dfs -format" or provide a special script, "format-dfs".
      This would format all the cluster components together. The open question is whether it should also
      start the cluster after formatting.
      2) Format the name-node only. When data-nodes connect to the name-node, it tells them to
      format their storage directories if it sees that the namespace is empty and its cTime=0.
      The drawback of this approach is that we can lose the blocks of a data-node from another cluster
      if it connects by mistake to the empty name-node.
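
      The decision rule in approach 2 might look something like the sketch below. The names
      FormatDecisionSketch, Action, and onDataNodeConnect are hypothetical, and the choice to
      reject a mismatched data-node when the namespace is not empty is an assumption based on
      the current failure behaviour, not something the proposal spells out.

```java
// Hypothetical sketch of approach 2; not part of any Hadoop API.
final class FormatDecisionSketch {

    enum Action { REGISTER, FORMAT_STORAGE, REJECT }

    // Decides what the name-node tells a connecting data-node.
    static Action onDataNodeConnect(boolean namespaceEmpty, long cTime,
                                    int nameNodeNamespaceId, int dataNodeNamespaceId) {
        if (nameNodeNamespaceId == dataNodeNamespaceId) {
            return Action.REGISTER; // IDs match: normal registration
        }
        if (namespaceEmpty && cTime == 0) {
            // Freshly formatted name-node: instruct the data-node to format its storage.
            return Action.FORMAT_STORAGE;
        }
        return Action.REJECT; // assumed: keep today's "incompatible" failure otherwise
    }

    public static void main(String[] args) {
        // A data-node left over from before the reformat is told to format.
        System.out.println(onDataNodeConnect(true, 0L, 2002, 1001));
        // A data-node reaching a populated name-node with a different ID is rejected.
        System.out.println(onDataNodeConnect(false, 0L, 2002, 1001));
    }
}
```

      The sketch also makes the drawback visible: the rule cannot tell a stale data-node of
      this cluster from a data-node of another cluster that connected by mistake, so both
      would receive FORMAT_STORAGE from an empty name-node.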

      1. HDFS-107-1.patch
        12 kB
        ramkrishna.s.vasudevan


          People

          • Assignee:
            Unassigned
            Reporter:
            Konstantin Shvachko
          • Votes:
            12
            Watchers:
            23
