[HDFS-4832] Namenode doesn't change the number of missing blocks in safemode when DNs rejoin or leave - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Critical
Resolution: Fixed
Affects Version/s: 0.23.7, 2.1.0-beta, 3.0.0-alpha1
Fix Version/s: 2.1.0-beta, 0.23.9
Component/s: None
Labels: None
Hadoop Flags: Reviewed
Release Note:
This change makes the name node keep its internal replication queues and data node state updated while in manual safe mode. This allows metrics and the UI to present up-to-date information during safe mode. The behavior during start-up safe mode is unchanged.

Description

Courtesy Karri VRK Reddy!

1. The Namenode lost datanodes, causing missing blocks.
2. The Namenode was put in safe mode.
3. The datanodes were restarted on the dead nodes.
4. Waited a long time for the NN UI to reflect the recovered blocks.
5. Forced the NN out of safe mode, and suddenly there were no more missing blocks.
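
The sequence above can be illustrated with a toy model. This is only a sketch under stated assumptions: `SafeModeModel`, its methods, and the `updateInManualSafeMode` flag are hypothetical names invented for illustration and are not Hadoop classes or the actual patch. The flag stands in for the behavior change described in the release note: when it is false, the reported missing-block count is frozen while in manual safe mode; when it is true, the count follows datanodes as they leave and rejoin.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model only: none of these names exist in Hadoop. */
public class SafeModeModel {
    enum Mode { NORMAL, STARTUP_SAFE_MODE, MANUAL_SAFE_MODE }

    // block id -> datanodes currently holding a live replica
    private final Map<String, Set<String>> liveReplicas = new HashMap<>();
    // what the "UI" would report as missing
    private final Set<String> reportedMissing = new HashSet<>();
    // true models the behavior after the fix (assumption for illustration)
    private final boolean updateInManualSafeMode;
    private Mode mode = Mode.NORMAL;

    SafeModeModel(boolean updateInManualSafeMode) {
        this.updateInManualSafeMode = updateInManualSafeMode;
    }

    void setMode(Mode newMode) {
        mode = newMode;
        refreshIfActive();
    }

    void datanodeJoined(String node, String... blocks) {
        for (String block : blocks) {
            liveReplicas.computeIfAbsent(block, k -> new HashSet<>()).add(node);
        }
        refreshIfActive();
    }

    void datanodeLost(String node) {
        for (Set<String> nodes : liveReplicas.values()) {
            nodes.remove(node);
        }
        refreshIfActive();
    }

    int reportedMissingBlocks() {
        return reportedMissing.size();
    }

    // Recompute the reported set only when the queues are considered active.
    private void refreshIfActive() {
        boolean active = mode == Mode.NORMAL
                || (mode == Mode.MANUAL_SAFE_MODE && updateInManualSafeMode);
        if (!active) {
            return;
        }
        reportedMissing.clear();
        for (Map.Entry<String, Set<String>> e : liveReplicas.entrySet()) {
            if (e.getValue().isEmpty()) {
                reportedMissing.add(e.getKey());
            }
        }
    }

    /** Runs steps 1 to 4 and returns what the UI would show at step 4. */
    static int missingAfterRejoin(boolean fixed) {
        SafeModeModel nn = new SafeModeModel(fixed);
        nn.datanodeJoined("dn1", "blk_1", "blk_2");
        nn.datanodeLost("dn1");                      // step 1
        nn.setMode(Mode.MANUAL_SAFE_MODE);           // step 2
        nn.datanodeJoined("dn1", "blk_1", "blk_2");  // step 3
        return nn.reportedMissingBlocks();           // step 4
    }

    public static void main(String[] args) {
        System.out.println("before fix: " + missingAfterRejoin(false)); // 2
        System.out.println("after fix:  " + missingAfterRejoin(true));  // 0
    }
}
```

In the unfixed model, calling `setMode(Mode.NORMAL)` afterwards triggers a refresh and the count drops to 0, which mirrors step 5.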

I was able to reproduce this on 0.23 and trunk. I set dfs.namenode.heartbeat.recheck-interval to 1 and killed the DN to simulate a "lost" datanode. The opposite case also has problems (i.e. a datanode failing while the NN is in safemode doesn't lead to a missing blocks message).
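
For context on why lowering dfs.namenode.heartbeat.recheck-interval speeds up the reproduction: as far as I know, the NameNode treats a datanode as dead after roughly 2 * recheck interval + 10 * heartbeat interval. The sketch below only does that arithmetic. The class and method names are hypothetical, and the default values used (a recheck interval of 300000 ms and a dfs.heartbeat.interval of 3 s) are assumptions that should be checked against hdfs-default.xml for the release in use.

```java
/** Hypothetical helper; it only illustrates the expiry arithmetic. */
public class HeartbeatExpiry {

    /**
     * Approximate time after which a silent datanode is declared dead:
     * 2 * recheck interval (ms) + 10 * heartbeat interval (s, converted to ms).
     */
    static long expiryMillis(long recheckIntervalMs, long heartbeatIntervalSec) {
        return 2 * recheckIntervalMs + 10 * 1000 * heartbeatIntervalSec;
    }

    public static void main(String[] args) {
        // Assumed defaults: 300000 ms recheck, 3 s heartbeat
        System.out.println(expiryMillis(300_000, 3)); // 630000 (10.5 minutes)
        // Setting from the description: recheck interval of 1 ms
        System.out.println(expiryMillis(1, 3));       // 30002 (about 30 seconds)
    }
}
```

Under these assumptions, a killed DN is marked dead after about 30 seconds instead of about 10.5 minutes.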

Unless the NN keeps its list of missing blocks up to date, the grid admins will not know when to take the cluster out of safemode.

Attachments


HDFS-4832.patch (16/May/13 23:07, 0.8 kB, Ravi Prakash)
HDFS-4832.patch (23/May/13 20:09, 7 kB, Ravi Prakash)
HDFS-4832.patch (29/May/13 01:55, 7 kB, Ravi Prakash)
HDFS-4832.patch (29/May/13 19:21, 7 kB, Ravi Prakash)
HDFS-4832.branch-0.23.patch (07/Jun/13 06:09, 7 kB, Ravi Prakash)
HDFS-4832.patch (07/Jun/13 06:13, 8 kB, Ravi Prakash)
HDFS-4832.patch (07/Jun/13 14:55, 8 kB, Ravi Prakash)

Issue Links

is blocked by: HDFS-4862 SafeModeInfo.isManual() returns true when resources are low even if it wasn't entered into manually (Closed)

is related to: HDFS-4867 metaSave NPEs when there are invalid blocks in repl queue. (Closed)

People

Assignee: Ravi Prakash

Reporter: Ravi Prakash

Votes: 0

Watchers: 8

Dates

Created: 16/May/13 20:47

Updated: 12/May/16 18:17

Resolved: 07/Jun/13 20:18