Hadoop HDFS > HDFS-2126 Improve Namenode startup time [umbrella task] > HDFS-1295

Improve namenode restart times by short-circuiting the first block reports from datanodes

    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.23.0
    • Fix Version/s: Federation Branch, 0.23.0
    • Component/s: namenode
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Namenode restart time is dominated by block report processing. On a 2000-node cluster with 90 million blocks, block report processing takes 30 to 40 minutes. The namenode "diffs" the contents of each incoming block report against the contents of the blocksMap and then applies these diffs, but for the first block report from a datanode there is nothing to diff against, so computing the "diff" is unnecessary.

      This code change speeds up block report processing roughly threefold.
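The short-circuit described above can be sketched as follows. This is a minimal illustration of the idea, not the actual HDFS code; the class and field names are hypothetical.

```java
import java.util.*;

// Hypothetical sketch: first block report from a datanode is inserted
// directly into the blocks map; later reports go through the diff path.
class BlockReportProcessor {
    final Map<Long, String> blocksMap = new HashMap<>();  // blockId -> datanode
    final Set<String> reportedOnce = new HashSet<>();     // datanodes that have reported

    void processReport(String datanode, List<Long> reportedBlocks) {
        if (reportedOnce.add(datanode)) {
            // First report from this datanode: nothing to diff against,
            // so insert every reported block directly.
            for (long blockId : reportedBlocks) {
                blocksMap.put(blockId, datanode);
            }
        } else {
            // Subsequent reports: compute the (expensive) diff as before.
            reportDiff(datanode, reportedBlocks);
        }
    }

    void reportDiff(String datanode, List<Long> reportedBlocks) {
        // placeholder for the existing diff-based processing path
    }
}
```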

      1. HDFS-1295_for_ymerge_v2.patch
        33 kB
        Matt Foley
      2. HDFS-1295_for_ymerge.patch
        32 kB
        Matt Foley
      3. HDFS-1295_delta_for_trunk.patch
        2 kB
        Matt Foley
      4. IBR_shortcut_v7atrunk.patch
        33 kB
        Matt Foley
      5. IBR_shortcut_v6atrunk.patch
        49 kB
        Matt Foley
      6. IBR_shortcut_v4atrunk.patch
        49 kB
        Matt Foley
      7. IBR_shortcut_v4atrunk.patch
        49 kB
        Matt Foley
      8. IBR_shortcut_v4atrunk.patch
        49 kB
        Matt Foley
      9. IBR_shortcut_v3atrunk.patch
        28 kB
        Matt Foley
      10. IBR_shortcut_v2a.patch
        29 kB
        Matt Foley
      11. shortCircuitBlockReport_1.txt
        10 kB
        dhruba borthakur

        Issue Links

          Activity

          dhruba borthakur added a comment -

          This patch applies only to 0.20 and does two things:

          1. The first block report from a datanode is inserted directly into the blocksMap.
          2. The countLiveNodes() method is optimized to be lightweight.

          I extended the attached unit test to gather the average time to process the first block report from a datanode:

          Configuration                 without patch   with patch
          40 nodes, 200K blocks each    232 sec         80 sec
          80 nodes, 200K blocks each    240 sec         82 sec
          Konstantin Shvachko added a comment -

          Could you please explain what happens with blocks in the first block report from a DN that do not belong to any file. Are they also inserted to the blocksMap?

          dhruba borthakur added a comment -

          > what happens with blocks in the first block report from a DN that do not belong to any file

          They are discarded as usual inside FSNamesystem.addStoredBlock(). This method checks whether the block belongs to an inode, and only then inserts it into the blocksMap (this is existing code and is not modified by this patch).
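The membership check dhruba describes can be sketched like this. It is an illustrative simplification, assuming hypothetical names, not the real FSNamesystem internals.

```java
import java.util.*;

// Sketch of the addStoredBlock() gate: a reported replica is only
// recorded if its block belongs to some inode (file).
class BlocksMapSketch {
    final Map<Long, Long> blockToInode = new HashMap<>();       // known blocks -> owning inode id
    final Map<Long, Set<String>> blocksMap = new HashMap<>();   // block -> replica locations

    // Returns true only if the block belongs to a file; otherwise the
    // replica is ignored, as the existing code does.
    boolean addStoredBlock(long blockId, String datanode) {
        Long inode = blockToInode.get(blockId);
        if (inode == null) {
            return false;  // block belongs to no file: ignore it
        }
        blocksMap.computeIfAbsent(blockId, k -> new HashSet<>()).add(datanode);
        return true;
    }
}
```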

          Konstantin Shvachko added a comment -

          Not discarded, but rather ignored by addStoredBlock(). Therefore, with your patch the toInvalidate list of blocks will not be calculated and processed, so deletion of replicas on data-nodes that don't belong to any file will be delayed until the next block report.
          My point is that there is a trade-off here, which you are not mentioning, unless I missed something.
          The trade-off is: you will start faster, but space cleanup will be delayed.
          The only way to fix it that I can see is to send a second block report right after the first one, which would double the load on the NN during startup.

          What is interesting, though, is that the numbers you present show that constructing the LinkedList in reportDiff() is time consuming, because the actual speedup comes from reusing the same Block object rather than creating one for each processed block as reportDiff() does.
          So maybe if we address this, we can optimize overall block report processing, including startup time.

          Btw, if it helps, you can use NNThroughputBenchmark to measure block report processing on a single node.
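The trade-off Konstantin raises hinges on the diff step that the shortcut skips. A minimal sketch of that step, under hypothetical names, shows why stale replicas linger: the toInvalidate list is only built during a diff.

```java
import java.util.*;

// Sketch of the diff whose omission causes the delayed-cleanup trade-off:
// reported blocks that belong to no file go on toInvalidate, which drives
// replica deletion on datanodes.
class ReportDiffSketch {
    // Blocks the namenode believes exist (block -> owning file id).
    final Map<Long, Long> knownBlocks = new HashMap<>();

    List<Long> diff(List<Long> reportedBlocks) {
        List<Long> toInvalidate = new LinkedList<>();
        for (long b : reportedBlocks) {
            if (!knownBlocks.containsKey(b)) {
                toInvalidate.add(b);
            }
        }
        // Short-circuiting the first report skips this loop entirely, so
        // stale replicas survive until the next full block report.
        return toInvalidate;
    }
}
```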

          dhruba borthakur added a comment -

          > Not discarded, but rather ignored by addStoredBlock().

          Completely agree.

          > The trade-off is: you will start faster, but space cleanup will be delayed.

          That sounds like a fine tradeoff, don't you agree? If you like, I can make it configurable, but the fewer configuration parameters, the better!

          luoli added a comment -

          I think the trade-off depends on how HDFS is being used. In our case, high availability matters much more than the cleanup delay, because many applications depend on HDFS to provide 24/7 service. So I agree with Dhruba that this is a fine tradeoff.

          dhruba borthakur added a comment -

          > because the actual speedup is achieved because you reuse the same Block object

          I ran an experiment that did not reuse the same Block object with the new short-circuiting code; the performance gain remains the same.

          > The trade-off is: you will start faster, but space cleanup will be delayed.
          > That sounds like a fine tradeoff, don't you agree?

          Thanks, luoli, for your comments. Konstantin: does this tradeoff look fine to you?

          Matt Foley added a comment -

          Hi Dhruba, in HDFS-1147 you commented on the same topic, "If somebody can explain how this approach can be made to work in the presence of corrupt replicas, that will be great."

          This seems to be a different issue than the delay in cleaning up deleted blocks. Is it still relevant? Thanks.

          dhruba borthakur added a comment -

          Hi Matt, thanks for the info about HDFS-1147. I closed it as a duplicate of this one.

          > this approach can be made to work in the presence of corrupt replicas, that will be great.

          Can somebody please elaborate on what issues a corrupt replica might cause?

          Matt Foley added a comment -

          Hi Dhruba, I think this is a really important improvement to startup time, so I will try to get some contributors here to review it.
          Regarding this dangling issue, can you please describe any risks you see from corrupt replicas, with this shortcut in place? Thanks.

          Matt Foley added a comment -

          Dhruba, I hope you don't mind if I submit an update to your excellent idea. This patch builds on it and includes the following:

          • Ported up to work on the current trunk.
          • Carefully studied the shortcuts available based on (a) being in startup safe mode, (b) being in non-startup safe mode, and (c) processing the first block report from a given datanode. Tried to shortcut where we could, and not where we shouldn't, and to handle corrupt blocks correctly.
          • Refactored to simplify and clarify the Block Report processing code, so it will be easier to continue working on this area:
          • moved processReportedBlock() and reportDiff() methods into BlockManager with the other BR code;
          • added isReplicaCorrupt() and isBlockUnderConstruction() to replace complex code in processReportedBlock().
          • Changed the "toAdd" queue to use BlockInfo instead of Block, thereby avoiding lots of redundant blocksMap lookups. Still do a refresh lookup when needed, when the Block is actually a BlockInfoUnderConstruction that might have become completed.
          • Changed processReportedBlock() to have no side effects, by introducing a "toUC" queue so that Under-Construction blocks are treated consistently with the other types of reported block, in terms of creating to-do lists before processing the lists.

          This patch passes test-patch on my local system, except for "-1 tests included." The tests I used are in a new benchmark program for IBR (Initial Block Report) performance analysis which I've uploaded to HDFS-1732. Dhruba's attached unit test could also be used for a simpler test case.
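The "to-do lists first, then process" structure Matt describes can be sketched as below. The queue names (toAdd, toInvalidate, toUC) follow the comment; the toCorrupt queue, the classify() signature, and everything else are illustrative assumptions, not the actual BlockManager code.

```java
import java.util.*;

// Sketch of side-effect-free block report classification: reported blocks
// are only appended to queues here; mutation of real namenode state
// happens afterward, when the queues are drained.
class ReportQueues {
    final List<Long> toAdd = new LinkedList<>();         // valid replicas to record
    final List<Long> toInvalidate = new LinkedList<>();  // replicas belonging to no file
    final List<Long> toCorrupt = new LinkedList<>();     // corrupt replicas
    final List<Long> toUC = new LinkedList<>();          // under-construction blocks

    void classify(long blockId, boolean known, boolean corrupt, boolean underConstruction) {
        if (!known) {
            toInvalidate.add(blockId);
        } else if (corrupt) {
            toCorrupt.add(blockId);
        } else if (underConstruction) {
            toUC.add(blockId);
        } else {
            toAdd.add(blockId);
        }
    }
}
```

Keeping classification free of side effects means under-construction blocks are handled consistently with every other reported block type, which is the point of the toUC queue above.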

          dhruba borthakur added a comment -

          Good stuff, Matt, thanks for doing this.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12472903/IBR_shortcut_v2a.patch
          against trunk revision 1079069.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these core unit tests:
          org.apache.hadoop.hdfs.server.balancer.TestBalancer
          org.apache.hadoop.hdfs.server.datanode.TestBlockReplacement
          org.apache.hadoop.hdfs.server.namenode.TestBlocksWithNotEnoughRacks
          org.apache.hadoop.hdfs.server.namenode.TestBlockTokenWithDFS
          org.apache.hadoop.hdfs.server.namenode.TestNodeCount
          org.apache.hadoop.hdfs.TestDatanodeBlockScanner

          -1 contrib tests. The patch failed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/239//testReport/
          Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/239//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/239//console

          This message is automatically generated.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12472903/IBR_shortcut_v2a.patch
          against trunk revision 1087900.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          -1 javac. The patch appears to cause tar ant target to fail.

          -1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these core unit tests:

          -1 contrib tests. The patch failed contrib unit tests.

          -1 system test framework. The patch failed system test framework compile.

          Test results: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/310//testReport/
          Console output: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/310//console

          This message is automatically generated.

          Matt Foley added a comment -

          Updated to current trunk.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12475272/IBR_shortcut_v3atrunk.patch
          against trunk revision 1087900.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these core unit tests:
          org.apache.hadoop.hdfs.server.balancer.TestBalancer
          org.apache.hadoop.hdfs.server.datanode.TestBlockReplacement
          org.apache.hadoop.hdfs.server.datanode.TestBlockReport
          org.apache.hadoop.hdfs.server.namenode.TestBlocksWithNotEnoughRacks
          org.apache.hadoop.hdfs.server.namenode.TestBlockTokenWithDFS
          org.apache.hadoop.hdfs.server.namenode.TestNodeCount
          org.apache.hadoop.hdfs.TestDatanodeBlockScanner
          org.apache.hadoop.hdfs.TestFileConcurrentReader

          -1 contrib tests. The patch failed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/311//testReport/
          Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/311//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/311//console

          This message is automatically generated.

          Matt Foley added a comment -

          Small change in IBR_shortcut to correctly adapt to use of isPopulatingReplQueues() in current trunk.

          Also included changes to the six unit tests that were failing. Several timed out in Hudson without useful error logs. I examined these, added timeouts, throws of TimeoutException, and useful log info; in some cases I also fixed what appeared by code inspection to be the source of the problem. They will likely need another pass, but I'm submitting them through Hudson to see. (None fail in my local test environment.)
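The test-hardening pattern described here (bound a polling wait and throw TimeoutException with useful context instead of hanging) can be sketched as follows. The helper name and signature are hypothetical.

```java
import java.util.concurrent.TimeoutException;

// Sketch of a bounded polling wait: fail fast with a descriptive
// TimeoutException rather than letting the test hang in Hudson.
class WaitUtil {
    static void waitFor(java.util.function.BooleanSupplier condition,
                        long timeoutMs, String what)
            throws TimeoutException, InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (!condition.getAsBoolean()) {
            if (System.currentTimeMillis() > deadline) {
                throw new TimeoutException("Timed out waiting for " + what);
            }
            Thread.sleep(50);  // poll interval
        }
    }
}
```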

          Matt Foley added a comment -

          Toggling "submit patch" state to trigger auto test.

          Matt Foley added a comment -

          Hudson auto-test is refusing to trigger.
          Will try once more with full undo/redo.

          Matt Foley added a comment -

          Resubmit to try to trigger auto-test.

          Matt Foley added a comment -

          Trying again with status "Resume Workflow". Sorry for the spam.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12476063/IBR_shortcut_v4atrunk.patch
          against trunk revision 1091131.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 15 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these core unit tests:
          org.apache.hadoop.hdfs.TestDatanodeBlockScanner
          org.apache.hadoop.hdfs.TestFileConcurrentReader

          -1 contrib tests. The patch failed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/346//testReport/
          Findbugs warnings: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/346//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/346//console

          This message is automatically generated.

          Matt Foley added a comment -

          These tests all failed as false positives, due to timing changes or other issues that are not actual bugs in the implementation of HDFS-1295. Some of them, like HDFS-1806 (TestBlockReport.blockReport_08() and _09()), have been persistent false positives across many submissions over the past weeks.

          They are all rather small fixes. I've split the patches out to the individual bug reports for ease of examination, but if it is okay with everyone, I'll make a single submission under the HDFS-1295 banner.

          Matt Foley added a comment -

          Response to Hudson QA auto-test of 12/Apr/11 00:14 (PreCommit-HDFS-Build/346):

          Of the four failing tests, two are our old friends hdfsproxy, unrelated to this Jira.
          One (TestFileConcurrentReader.testUnfinishedBlockCRCErrorNormalTransferVerySmallWrite) is three days old, and unrelated to this Jira.

          The fourth, TestDatanodeBlockScanner.testTruncatedBlockReport, needs to be investigated, but is likely to also be a test issue rather than a bug in the patch.
          Also, I found that the patch for HDFS-1829 should be modified to use readLock() instead of synchronized(namesystem).
          These are likely to be small changes, while the main patch to BlockManager is fairly large, and likely to be unchanged by the fix to TestDatanodeBlockScanner.testTruncatedBlockReport.

          Therefore, please consider starting code review if you are so inclined, so that we can complete this submission soon. Thank you very much.
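
          The switch from synchronized(namesystem) to readLock() mentioned above refers to FSNamesystem's reader/writer locking, which lets read-only scans run concurrently instead of serializing on the namesystem monitor. A minimal sketch of the pattern, using an invented class and counter rather than the actual FSNamesystem code:

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch: readers take the shared readLock() instead of
// synchronizing on the whole namesystem object; writers keep exclusion.
public class ReadLockSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private long blockCount = 0;

    // Writer path: exclusive lock, as before.
    void addBlock() {
        lock.writeLock().lock();
        try {
            blockCount++;
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Reader path: shared lock replaces synchronized(namesystem),
    // so concurrent readers do not block each other.
    long getBlockCount() {
        lock.readLock().lock();
        try {
            return blockCount;
        } finally {
            lock.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        ReadLockSketch ns = new ReadLockSketch();
        ns.addBlock();
        ns.addBlock();
        System.out.println("blocks=" + ns.getBlockCount());
    }
}
```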

          Matt Foley added a comment -

          The unit test files cited in HDFS-1806, HDFS-1808, HDFS-1827, HDFS-1828, and HDFS-1829, were all found to be failing for reasons unrelated to the patch for this bug, HDFS-1295. I'm submitting fixes for each of them, but I've detached them from this bug. Please review them separately. Thanks.

          Matt Foley added a comment -

          This is a final candidate patch. Please review. It modifies three pairs of files:

          BlockManager and DatanodeDescriptor mods are the core changes implementing Dhruba's idea to shortcut Initial Block Reports.

          DataNode and FSNamesystem small mods improve the logging of Block Report processing on both NN and DN, so it is easier to see the improvement.

          TestDatanodeBlockScanner and DFSTestUtil mods solve several problems with TestDatanodeBlockScanner, including: (a) an outright bug in blockCorruptionRecoveryPolicy() that was causing testBlockCorruptionRecoveryPolicy2() to fail; (b) changes that make the test run much faster; and (c) more useful diagnostic output when the test fails.

          It is noted that TestDatanodeBlockScanner.testTruncatedBlockReport() showed a bug in the v4 implementation of IBR shortcuts. As a result, I had to put corrupt block processing back into the IBR shortcut path, rather than ignoring them. This should not cause much change in the perf improvement, since there should be very small numbers of corrupt blocks.
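
          The shortcut described above can be pictured roughly as follows. This is an illustrative sketch, not the actual BlockManager code; the class, field, and method names are invented. On the first block report from a datanode there is nothing to diff against, so blocks go straight into the blocks map, while corrupt blocks are still routed through normal handling rather than ignored:

```java
import java.util.*;

// Hypothetical sketch of the initial block report (IBR) shortcut.
public class IbrShortcutSketch {
    static class Block {
        final long id;
        final boolean corrupt;
        Block(long id, boolean corrupt) { this.id = id; this.corrupt = corrupt; }
    }

    final Map<Long, Block> blocksMap = new HashMap<>();
    final List<Long> corruptQueue = new ArrayList<>();
    final Set<String> reportedNodes = new HashSet<>();

    void processReport(String node, List<Block> report) {
        if (reportedNodes.add(node)) {
            // First report from this node: shortcut path, no diff computed.
            for (Block b : report) {
                if (b.corrupt) {
                    corruptQueue.add(b.id);   // corrupt blocks still handled
                } else {
                    blocksMap.put(b.id, b);   // inserted directly
                }
            }
        }
        // Subsequent reports would take the original diff path (omitted here).
    }

    public static void main(String[] args) {
        IbrShortcutSketch bm = new IbrShortcutSketch();
        bm.processReport("dn1", Arrays.asList(
            new Block(1, false), new Block(2, false), new Block(3, true)));
        System.out.println("mapped=" + bm.blocksMap.size()
            + " corrupt=" + bm.corruptQueue.size());
    }
}
```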

          Suresh Srinivas added a comment -

          Comments:

          1. Minor: in TestDatanodeBlockScanner, could you use LOG.info or LOG.debug instead of System.out?
          2. Is it worth retaining printDatanodeAssignments() and printDatanodeBlockReports(), which were probably added as debug code?
          3. The test sets TIMEOUT to 20s. Is that long enough that the tests will not fail spuriously?
          4. In block report time, why is the report creation time not included in metrics?
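
          Suggestion #1 is the standard swap from stdout prints to a logger, which gives level control and consistent capture in test output. The Hadoop tests use commons-logging's Log; the sketch below illustrates the same pattern with java.util.logging so it stands alone, and its class and message are invented, not the actual test code:

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Illustrative only: route test diagnostics through a logger instead of
// System.out, so the test framework controls and captures the output.
public class LogInsteadOfStdout {
    private static final Logger LOG =
        Logger.getLogger(LogInsteadOfStdout.class.getName());

    public static void main(String[] args) {
        // Before: System.out.println("Corrupted block report received");
        // After (goes through the logging framework, here at INFO level):
        LOG.log(Level.INFO, "Corrupted block report received");
        System.out.println("logged at INFO");
    }
}
```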
          Matt Foley added a comment -

          The TestDatanodeBlockScanner and DFSTestUtil mods have been moved to bugs HDFS-1855, HDFS-1856, and HDFS-1854. Suresh's suggestions #1 and #2 regarding TestDatanodeBlockScanner were incorporated.

          #3: Yes, the 20sec timeout is plenty long, because the config params were modified to make the testcase run much faster.

          #4: In block report time, the report creation time is included in metrics; see line 3156 of FSNamesystem. However, it is still useful to provide the log message as given, because the metrics are bucketed and not possible to cross-correlate with a particular datanode. Also, the original log message (line 3137 of the unmodified file) stated that a report from a given datanode was processed but didn't give the processing time.

          dhruba borthakur added a comment -

          hi suresh, the block-report creation time is included in the metric.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12476974/IBR_shortcut_v7atrunk.patch
          against trunk revision 1095789.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these core unit tests:
          org.apache.hadoop.hdfs.TestFileConcurrentReader

          +1 contrib tests. The patch passed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/399//testReport/
          Findbugs warnings: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/399//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/399//console

          This message is automatically generated.

          Suresh Srinivas added a comment -

          > the block-report creation time is included in the metric.
          Sorry I missed this.

          +1 for the patch.

          dhruba borthakur added a comment -

          Code looks good to me, +1

          Matt Foley added a comment -

          Thank you, gentlemen.

          The failed unit test TestFileConcurrentReader is unrelated to this patch.

          Ready to commit.

          Suresh Srinivas added a comment -

          I committed the patch. Thank you Matt.

          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk-Commit #607 (See https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/607/)
          HDFS-1295. Improve namenode restart times by short-circuiting the first block reports from datanodes. Contributed by Matt Foley.

          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk #650 (See https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/650/)

          Matt Foley added a comment -

          This patch needs to be ported to yahoo-merge branch.

          Also, the integration of HDFS-1052 into trunk (r1097905 on Apr 29) reverted a portion of this patch in DataNode.java in trunk.

          Matt Foley added a comment -

          Delta patch to re-integrate HDFS-1295 with DataNode.java in trunk.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12481878/HDFS-1295_delta_for_trunk.patch
          against trunk revision 1133476.

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

          +1 release audit. The applied patch does not increase the total number of release audit warnings.

          -1 core tests. The patch failed these core unit tests:
          org.apache.hadoop.cli.TestHDFSCLI

          +1 contrib tests. The patch passed contrib unit tests.

          +1 system test framework. The patch passed system test framework compile.

          Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/744//testReport/
          Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/744//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/744//console

          This message is automatically generated.

          Matt Foley added a comment -

          Attaching patch ported to yahoo-merge branch.

          Turning off "Patch Available" so Hudson doesn't try to run test-patch on non-trunk patch.

          Suresh Srinivas added a comment -

          +1 for the trunk patch

          Suresh Srinivas added a comment -

          +1 for the yahoo-merge patch also

          Matt Foley added a comment -

          Response to test-patch:
          -1 core tests: the TestHDFSCLI failure is unrelated.
          -1 tests included: This is simply a completion of the previously approved patch.

          Committed HDFS-1295_delta_for_trunk.patch to trunk.

          Matt Foley added a comment -

          Turns out HDFS-1295 is dependent on HDFS-900. Merged HDFS-900 to yahoo-merge, but now need a slightly modified port of HDFS-1295. Attached.

          Suresh Srinivas added a comment -

          +1 for the patch.

          Matt Foley added a comment -

          Committed to yahoo-merge branch. Thanks for the review, Suresh!

          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk-Commit #746 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/746/)

          Hudson added a comment -

          Integrated in Hadoop-Hdfs-trunk #699 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk/699/)


            People

            • Assignee: Matt Foley
            • Reporter: dhruba borthakur
            • Votes: 0
            • Watchers: 14