Hadoop HDFS / HDFS-14186

Blockreport storm slows down namenode restart seriously in large cluster


Details

    • Type: Improvement
    • Status: Reopened
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.7.1
    • Fix Version/s: None
    • Component/s: namenode
    • Labels: None

    Description

      In the current implementation, each datanode sends its blockreport immediately after it successfully registers with the namenode, so when the namenode restarts, the blockreport storm puts it under heavy load. One consequence is that some received RPCs have to be dropped because their queue time exceeds the timeout. If a datanode's heartbeat RPCs keep being dropped for long enough (the default heartbeatExpireInterval is 630s), the datanode is marked DEAD and has to re-register and send its blockreport again, which aggravates the blockreport storm and traps the cluster in a vicious circle. This slows down namenode startup seriously (by an hour or more), especially in a large (several thousand datanodes) and busy cluster. Although much work has been done to optimize namenode startup, the issue still exists.
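      For reference, here is a quick sketch of where the 630s figure comes from: with the default dfs.heartbeat.interval of 3s and dfs.namenode.heartbeat.recheck-interval of 300000ms, the expiry works out as below. The class name is illustrative, not HDFS code.
      {code:java}
      // Derivation of the default 630s heartbeat expiry from hdfs-default.xml values.
      public class HeartbeatExpiry {
        public static void main(String[] args) {
          long heartbeatIntervalSeconds = 3;         // dfs.heartbeat.interval (seconds)
          long heartbeatRecheckIntervalMs = 300_000; // dfs.namenode.heartbeat.recheck-interval (ms)

          // expire interval = 2 * recheck interval + 10 * heartbeat interval
          long heartbeatExpireIntervalMs =
              2 * heartbeatRecheckIntervalMs + 10 * 1000 * heartbeatIntervalSeconds;

          System.out.println(heartbeatExpireIntervalMs + " ms"); // 630000 ms = 630 s
        }
      }
      {code}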
      I propose to postpone the dead datanode check until the namenode has finished startup.
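      A minimal sketch of the idea (illustrative only, not the attached patch; the interfaces below are hypothetical stand-ins for the namenode internals): the heartbeat monitor simply skips the dead-node check while startup is still in progress, so heartbeats delayed behind the blockreport storm cannot get datanodes declared DEAD and forced to re-register.
      {code:java}
      import java.util.List;
      import java.util.concurrent.TimeUnit;

      // Illustrative sketch: postpone marking datanodes DEAD until startup has finished.
      class HeartbeatMonitorSketch implements Runnable {
        private static final long HEARTBEAT_EXPIRE_INTERVAL_MS = 630_000; // see derivation above
        private final NamesystemView namesystem;    // hypothetical narrow view of the namesystem
        private final List<DatanodeInfo> datanodes; // hypothetical registered-datanode list

        HeartbeatMonitorSketch(NamesystemView namesystem, List<DatanodeInfo> datanodes) {
          this.namesystem = namesystem;
          this.datanodes = datanodes;
        }

        @Override
        public void run() {
          while (!Thread.currentThread().isInterrupted()) {
            // Proposed change: do not run the dead-node check until startup is complete,
            // so slow heartbeats during the blockreport storm cannot expire a datanode.
            if (!namesystem.isStartupInProgress()) {
              long now = System.currentTimeMillis();
              for (DatanodeInfo dn : datanodes) {
                if (now - dn.lastHeartbeatMs() > HEARTBEAT_EXPIRE_INTERVAL_MS) {
                  dn.markDead();
                }
              }
            }
            try {
              TimeUnit.SECONDS.sleep(5); // recheck period
            } catch (InterruptedException e) {
              Thread.currentThread().interrupt();
            }
          }
        }

        /** Hypothetical interfaces standing in for the real namenode data structures. */
        interface NamesystemView { boolean isStartupInProgress(); }
        interface DatanodeInfo { long lastHeartbeatMs(); void markDead(); }
      }
      {code}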
      Any comments and suggestions are welcome.

      Attachments

        1. HDFS-14186.001.patch (11 kB, Xiaoqiao He)


          People

            Assignee: Xiaoqiao He (hexiaoqiao)
            Reporter: Xiaoqiao He (hexiaoqiao)

            Dates

              Created:
              Updated:
