[HDFS-1729] Improve metrics for measuring NN startup costs. - ASF JIRA

Details

Type: Improvement
Status: Closed
Priority: Major
Resolution: Invalid
Affects Version/s: None
Fix Version/s: 0.20.203.0
Component/s: namenode
Labels: None

Description

Current logging and metrics are insufficient to diagnose latency problems during cluster startup. Add:
1. better logs in both the Datanode and the Namenode for Initial Block Report processing, to help distinguish block report processing problems from RPC/queuing problems;
2. new logs to measure the cost of scanning all blocks for over-replicated, under-replicated, and invalid replicas, which occurs in the Namenode just before it exits safe mode;
3. new logs to measure the cost of processing the under-replicated and invalid replica queues (created by the scan in item 2), which occurs just after the Namenode exits safe mode and is said to take 100% of CPU.
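
The kind of per-phase cost logging requested above can be sketched as follows. This is a minimal illustration only, not the patch for this issue and not existing Hadoop code: the class StartupPhaseTimer, its methods, and the log message format are all hypothetical names chosen for the example. It assumes that logging the wall-clock time and item count of each phase is enough to tell slow processing apart from time spent waiting elsewhere.

```java
import java.util.Locale;
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;
import java.util.logging.Logger;

/** Hypothetical helper (not part of Hadoop): times one startup phase and logs its cost. */
final class StartupPhaseTimer {
    private static final Logger LOG =
        Logger.getLogger(StartupPhaseTimer.class.getName());

    // Nanosecond clock; injectable so the timing logic can be checked deterministically.
    private final LongSupplier nanoClock;

    public StartupPhaseTimer() {
        this(System::nanoTime);
    }

    public StartupPhaseTimer(LongSupplier nanoClock) {
        this.nanoClock = nanoClock;
    }

    /** Runs the phase, logs elapsed wall-clock time and item count, returns elapsed ms. */
    public long time(String phase, long itemCount, Runnable work) {
        long start = nanoClock.getAsLong();
        work.run();
        long elapsedMs =
            TimeUnit.NANOSECONDS.toMillis(nanoClock.getAsLong() - start);
        LOG.info(format(phase, itemCount, elapsedMs));
        return elapsedMs;
    }

    /** Builds the (assumed) log line for one phase. */
    public static String format(String phase, long itemCount, long elapsedMs) {
        return String.format(Locale.ROOT,
            "%s: processed %d items in %d ms", phase, itemCount, elapsedMs);
    }
}
```

A caller might wrap each phase separately, for example `new StartupPhaseTimer().time("replica scan", blockCount, scanTask)`, where blockCount and scanTask stand in for whatever the real code has on hand. Timing the initial block report, the pre-safe-mode scan, and the post-safe-mode queue processing as three distinct phases would show which one dominates startup.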

People

Assignee: Matthew Foley
Reporter: Owen O'Malley
Votes: 0
Watchers: 4

Dates

Created: 07/Mar/11 21:48
Updated: 02/Sep/11 22:17
Resolved: 21/Apr/11 00:21