Details
- Type: Improvement
- Status: In Progress
- Priority: Major
- Resolution: Unresolved
Description
When a cluster has more than a few thousand nodes and the NameNode service is restarted, every DataNode sends a full block report to the NameNode. While the NameNode is in safe mode, some DataNodes may send their full block report more than once. These repeated reports are unnecessary and consume a large share of the NameNode's RPC capacity.
As a result, some block report leases fail or time out, and in extreme cases the NameNode never leaves safe mode. The NameNode log shows the repeated reports being discarded and the leases being rejected:
2021-03-14 08:16:25,873 [78438700] - INFO [Block report processor:BlockManager@2158] - BLOCK* processReport 0xexxxxxxxx: discarded non-initial block report from DatanodeRegistration(xxxxxxxx:port, datanodeUuid=xxxxxxxx, infoPort=xxxxxxxx, infoSecurePort=xxxxxxxx, ipcPort=xxxxxxxx, storageInfo=lv=xxxxxxxx;nsid=xxxxxxxx;c=0) because namenode still in startup phase
2021-03-14 08:16:31,521 [78444348] - INFO [Block report processor:BlockManager@2158] - BLOCK* processReport 0xexxxxxxxx: discarded non-initial block report from DatanodeRegistration(xxxxxxxx, datanodeUuid=xxxxxxxx, infoPort=xxxxxxxx, infoSecurePort=xxxxxxxx, ipcPort=xxxxxxxx, storageInfo=lv=xxxxxxxx;nsid=xxxxxxxx;c=0) because namenode still in startup phase
2021-03-13 18:35:38,200 [29191027] - WARN [Block report processor:BlockReportLeaseManager@311] - BR lease 0xxxxxxxxx is not valid for DN xxxxxxxx, because the DN is not in the pending set.
2021-03-13 18:36:08,143 [29220970] - WARN [Block report processor:BlockReportLeaseManager@311] - BR lease 0xxxxxxxxx is not valid for DN xxxxxxxx, because the DN is not in the pending set.
2021-03-13 18:36:08,143 [29220970] - WARN [Block report processor:BlockReportLeaseManager@317] - BR lease 0xxxxxxxxx is not valid for DN xxxxxxxx, because the lease has expired.
2021-03-13 18:36:08,145 [29220972] - WARN [Block report processor:BlockReportLeaseManager@317] - BR lease 0xxxxxxxxx is not valid for DN xxxxxxxx, because the lease has expired.
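The idea behind the request can be sketched as follows. This is only an illustration and is not the patch for this issue: `StartupReportTracker` and `shouldProcess` are hypothetical names that do not exist in Hadoop, and the sketch assumes that a repeated report can be recognized by its DataNode UUID and storage ID alone.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical helper, not part of Hadoop: remembers which storages have
 * already delivered a full block report while the NameNode is starting up.
 */
class StartupReportTracker {
    // Keys have the form "datanodeUuid/storageId" for storages that already reported.
    private final Set<String> reported = ConcurrentHashMap.newKeySet();

    /**
     * Returns true if the report should be processed, and false if it is a
     * repeat that arrived while the NameNode is still in startup safe mode.
     */
    boolean shouldProcess(String datanodeUuid, String storageId, boolean inStartupSafeMode) {
        String key = datanodeUuid + "/" + storageId;
        if (!inStartupSafeMode) {
            // Outside the startup phase every report is processed as usual.
            reported.add(key);
            return true;
        }
        // Set.add returns false when the key is already present, i.e. a repeat.
        return reported.add(key);
    }
}
```

With a check like this, the first report from a storage returns true and a second report from the same storage during startup returns false. Skipping the repeat on the NameNode side only avoids the processing cost; avoiding the RPC itself would also require the DataNode not to resend.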
Issue Links
- Blocked: HDFS-17093 Fix block report lease issue to avoid missing some storages report. (Resolved)