[HDFS-107] Data-nodes should be formatted when the name-node is formatted.

Details

Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 0.23.0
Fix Version/s: None
Component/s: None
Labels: None

Description

The upgrade feature (HADOOP-702) requires data-nodes to store the namespaceID persistently
in their version files and to verify during startup that it matches the one stored on the name-node.
When the name-node is reformatted, it generates a new namespaceID.
If the cluster then starts with the reformatted name-node and data-nodes that were not reformatted,
the data-nodes fail with
java.io.IOException: Incompatible namespaceIDs ...
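
The startup check described above can be sketched roughly as follows. This is a minimal illustration, not the actual Hadoop code: the class name NamespaceCheckSketch, the method verifyNamespaceId, and the message text after the "Incompatible namespaceIDs" prefix are all assumptions made for the example.

```java
import java.io.IOException;

// Hypothetical sketch; the names here are not taken from the Hadoop source.
final class NamespaceCheckSketch {

    /**
     * Compares the namespaceID reported by the name-node with the one read
     * from the data-node's version file and fails on a mismatch.
     */
    static void verifyNamespaceId(int nameNodeId, int dataNodeId) throws IOException {
        if (nameNodeId != dataNodeId) {
            // Only the "Incompatible namespaceIDs" prefix comes from the report;
            // the rest of the message is illustrative.
            throw new IOException("Incompatible namespaceIDs: name-node = "
                    + nameNodeId + ", data-node = " + dataNodeId);
        }
    }

    public static void main(String[] args) {
        try {
            // Matching IDs: the data-node may continue starting up.
            verifyNamespaceId(42, 42);
            System.out.println("IDs match, startup continues");

            // The name-node was reformatted and now has a new ID: startup fails.
            verifyNamespaceId(99, 42);
        } catch (IOException e) {
            System.out.println("Startup failed: " + e.getMessage());
        }
    }
}
```

After a name-node reformat, the second call models the situation in this report: the data-node still holds the old ID, so the check throws.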

Data-nodes should be reformatted whenever the name-node is. I see two approaches here:
1) To reformat the cluster, we call "start-dfs -format" or provide a special script, "format-dfs".
This would format all cluster components together. The open question is whether it should also start
the cluster after formatting.
2) Format the name-node only. When data-nodes connect to the name-node, it tells them to
format their storage directories if it sees that its namespace is empty and its cTime=0.
The drawback of this approach is that we can lose the blocks of a data-node from another cluster
if it connects by mistake to the empty name-node.
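
To make approach 2 and its drawback concrete, here is a small sketch of the decision the name-node could make when a data-node connects. Everything in it is hypothetical: the class FormatOnConnectSketch, the Action enum, and the onRegister signature are invented for illustration, and the choice to reject a mismatched data-node when the name-node is not empty is an assumption that mirrors the current failure behavior.

```java
// Hypothetical sketch of approach 2; not an existing Hadoop API.
final class FormatOnConnectSketch {

    enum Action { PROCEED, FORMAT, REJECT }

    /**
     * Decides what the name-node tells a connecting data-node.
     *
     * @param namespaceEmpty true if the name-node's namespace holds no files
     * @param cTime          the name-node's cTime (0 means freshly formatted)
     * @param nameNodeId     namespaceID of the name-node
     * @param dataNodeId     namespaceID stored in the data-node's version file
     */
    static Action onRegister(boolean namespaceEmpty, long cTime,
                             int nameNodeId, int dataNodeId) {
        if (nameNodeId == dataNodeId) {
            return Action.PROCEED;      // same namespace, nothing to do
        }
        if (namespaceEmpty && cTime == 0L) {
            return Action.FORMAT;       // fresh name-node: tell the data-node to format
        }
        return Action.REJECT;           // assumed: keep failing as it does today
    }

    public static void main(String[] args) {
        // Stale data-node from the same, just reformatted cluster.
        System.out.println(onRegister(true, 0L, 99, 42));
        // Data-node whose ID differs, connecting to a populated name-node.
        System.out.println(onRegister(false, 0L, 99, 42));
    }
}
```

The name-node cannot tell a stale data-node of its own cluster from one that belongs to another cluster: both present a different namespaceID, so both would receive FORMAT from an empty name-node. That is the block-loss risk noted in approach 2.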

Attachments

HDFS-107-1.patch | 10/Jun/11 12:04 | 12 kB | ramkrishna.s.vasudevan

People

Assignee: Unassigned

Reporter: Konstantin Shvachko

Votes: 12

Watchers: 23

Dates

Created: 05/Apr/07 20:49

Updated: 10/Mar/12 18:43