[HDFS-5757] refreshNodes with many nodes at the same time could slow down NN - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Major
Resolution: Duplicate
Affects Version/s: None
Fix Version/s: None
Component/s: namenode
Labels: None

Description

Sometimes we need to decommission a whole rack of nodes at the same time. While the decommission is in progress, the NN is slow.

The reason is that when DecommissionManager checks the decomm status, it acquires the namesystem's writer lock and iterates through all DNs. For each DN that is in the decommissioning state, it checks whether replication is done for all the blocks on the machine via blockManager.isReplicationInProgress. On a large cluster, the number of blocks on a machine can be big, so the writer lock is held for a long time.

The fix could be to have DecommissionManager check only a few decomm-in-progress nodes each time it acquires the namesystem's writer lock, rather than all of them.
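The idea can be sketched roughly as follows. This is only an illustration, not the attached patch and not actual HDFS code: the class BatchedDecommCheck, the nodesPerLockHold limit, and the use of plain strings for datanodes are all hypothetical, and the predicate stands in for blockManager.isReplicationInProgress. Each call takes the writer lock once, checks at most a fixed number of nodes, and remembers where it stopped so the next call resumes from there.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Predicate;

/** Hypothetical sketch: bounded decommission checks per lock acquisition. */
class BatchedDecommCheck {
    // Stand-in for the namesystem lock (assumption, not the HDFS class).
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    // Nodes still decommissioning; strings stand in for datanode descriptors.
    private final List<String> decommissioning;
    // Upper bound on nodes examined while the writer lock is held.
    private final int nodesPerLockHold;
    // Index of the next node to examine, kept across calls.
    private int cursor = 0;

    BatchedDecommCheck(List<String> nodes, int nodesPerLockHold) {
        this.decommissioning = new ArrayList<>(nodes);
        this.nodesPerLockHold = nodesPerLockHold;
    }

    /**
     * Examines at most nodesPerLockHold nodes under the writer lock.
     * Nodes whose replication has finished are dropped from the list.
     * Returns the number of nodes examined in this call.
     */
    int checkOnce(Predicate<String> replicationInProgress) {
        lock.writeLock().lock();
        try {
            // Never examine the same node twice within one lock hold.
            int budget = Math.min(nodesPerLockHold, decommissioning.size());
            for (int i = 0; i < budget; i++) {
                if (cursor >= decommissioning.size()) {
                    cursor = 0; // wrap around to the start of the list
                }
                String node = decommissioning.get(cursor);
                if (replicationInProgress.test(node)) {
                    cursor++; // still replicating; revisit on a later pass
                } else {
                    decommissioning.remove(cursor); // done; next node shifts in
                }
            }
            return budget;
        } finally {
            lock.writeLock().unlock();
        }
    }

    int remaining() {
        return decommissioning.size();
    }
}
```

A periodic monitor thread would call checkOnce on each tick, so other operations get a chance to take the lock between batches. The trade-off is that a full pass over all decommissioning nodes takes more ticks.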

Attachments

HDFS-5757.patch (6 kB), 06/Dec/14 02:46, Ming Ma

Issue Links

duplicates HDFS-7411 Refactor and improve decommissioning logic into DecommissionManager (Closed)

People

Assignee: Ming Ma

Reporter: Ming Ma

Votes: 0

Watchers: 4

Dates

Created: 10/Jan/14 19:50

Updated: 04/May/15 18:42

Resolved: 04/May/15 18:42