[HDFS-395] DFS Scalability: Incremental block reports - ASF JIRA

Details

Type: Sub-task
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 2.0.0-alpha, 0.23.7
Component/s: datanode, namenode
Labels: None
Hadoop Flags: Incompatible change

Description

I have a cluster with 1800 datanodes. Each datanode hosts around 50,000 blocks and sends a block report to the namenode once every hour, so the namenode processes a block report roughly once every 2 seconds. Each block report lists every block the datanode currently hosts, so the namenode compares a huge number of blocks that remain practically unchanged between two consecutive reports. This wastes CPU on the namenode.

The problem becomes worse when the number of datanodes increases.
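
As a rough illustration of the load described above, the following sketch reproduces the arithmetic from the figures in this description (1800 datanodes, about 50,000 blocks each, one full report per hour). The class and method names are hypothetical and are not part of Hadoop; the model assumes reports are spread evenly over the interval.

```java
public class BlockReportLoad {

    /** Average seconds between full block reports arriving at the namenode. */
    static double secondsBetweenReports(int datanodes, long reportIntervalSeconds) {
        return (double) reportIntervalSeconds / datanodes;
    }

    /** Average block entries the namenode compares per second with full reports. */
    static double blocksComparedPerSecond(int datanodes, long blocksPerNode,
                                          long reportIntervalSeconds) {
        return (double) datanodes * blocksPerNode / reportIntervalSeconds;
    }

    public static void main(String[] args) {
        // Figures taken from the issue description.
        System.out.println(secondsBetweenReports(1800, 3600));            // 2.0
        System.out.println(blocksComparedPerSecond(1800, 50_000, 3600));  // 25000.0

        // Doubling the datanode count doubles the comparison rate.
        System.out.println(blocksComparedPerSecond(3600, 50_000, 3600));  // 50000.0
    }
}
```

Under these assumptions the comparison rate grows linearly with the number of datanodes, even when almost none of the reported blocks have changed.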

One proposal is to make subsequent block reports incremental once a full block report has been sent successfully. The namenode would then process only the blocks that were added or deleted since the previous report.
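
The proposal can be sketched roughly as follows. This is a minimal illustration only, not the code from the attached patches: `DatanodeTracker`, `NamenodeView`, and `Delta` are hypothetical names, block IDs are plain `long` values, and the sketch assumes every report is delivered successfully (a real implementation would need to handle lost or unacknowledged reports).

```java
import java.util.HashSet;
import java.util.Set;

public class IncrementalBlockReportSketch {

    /** Hypothetical incremental report: blocks added and deleted since the last report. */
    static final class Delta {
        final Set<Long> added;
        final Set<Long> deleted;

        Delta(Set<Long> added, Set<Long> deleted) {
            this.added = added;
            this.deleted = deleted;
        }
    }

    /** Datanode side: tracks hosted blocks plus the changes not yet reported. */
    static final class DatanodeTracker {
        private final Set<Long> blocks = new HashSet<>();
        private final Set<Long> added = new HashSet<>();
        private final Set<Long> deleted = new HashSet<>();

        void addBlock(long id) {
            // Re-adding a block with an unreported delete cancels that delete.
            if (blocks.add(id) && !deleted.remove(id)) {
                added.add(id);
            }
        }

        void deleteBlock(long id) {
            // Deleting a block with an unreported add cancels that add.
            if (blocks.remove(id) && !added.remove(id)) {
                deleted.add(id);
            }
        }

        /** Full report: every hosted block. Assumes delivery succeeds. */
        Set<Long> fullReport() {
            added.clear();
            deleted.clear();
            return new HashSet<>(blocks);
        }

        /** Incremental report: only changes since the previous report. */
        Delta incrementalReport() {
            Delta delta = new Delta(new HashSet<>(added), new HashSet<>(deleted));
            added.clear();
            deleted.clear();
            return delta;
        }
    }

    /** Namenode side: its view of the blocks hosted by one datanode. */
    static final class NamenodeView {
        private final Set<Long> blocks = new HashSet<>();

        void applyFull(Set<Long> report) {
            blocks.clear();
            blocks.addAll(report);
        }

        void applyDelta(Delta delta) {
            blocks.addAll(delta.added);
            blocks.removeAll(delta.deleted);
        }

        Set<Long> blocks() {
            return blocks;
        }
    }

    public static void main(String[] args) {
        DatanodeTracker dn = new DatanodeTracker();
        NamenodeView nn = new NamenodeView();

        dn.addBlock(1);
        dn.addBlock(2);
        dn.addBlock(3);
        nn.applyFull(dn.fullReport());   // first report is full

        dn.addBlock(4);
        dn.deleteBlock(2);
        Delta delta = dn.incrementalReport();
        nn.applyDelta(delta);            // namenode touches only two entries

        System.out.println(delta.added.size() + delta.deleted.size()); // 2
    }
}
```

In this sketch the namenode's work per report is proportional to the number of changed blocks rather than to the total number of blocks on the datanode.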

Attachments

blockReportPeriod.patch, 16/May/07 23:02, 3 kB, Dhruba Borthakur
explicitAcks.patch-3, 28/Jul/11 00:41, 31 kB, Tomasz Nykiel
explicitAcks.patch-4, 23/Aug/11 22:54, 30 kB, Tomasz Nykiel
explicitAcks.patch-5, 24/Aug/11 03:32, 30 kB, Tomasz Nykiel
explicitAcks.patch-6, 25/Aug/11 14:58, 31 kB, Tomasz Nykiel
explicitDeleteAcks.patch, 01/Jul/11 20:20, 20 kB, Tomasz Nykiel

People

Assignee: Tomasz Nykiel

Reporter: Dhruba Borthakur

Votes: 0

Watchers: 21

Dates

Created: 07/Mar/07 17:58

Updated: 28/Sep/15 20:58

Resolved: 26/Aug/11 03:12