Details
-
Bug
-
Status: Closed
-
Major
-
Resolution: Fixed
-
None
-
None
-
None
Description
I have a 2000 node cluster and the block report interval is set to 1 hour. Most block report arrive within a few minutes of one another. For example, I have seen block reports from all 2000 nodes arrive within 5 minutes of one another. This causes CPU overload on the namenode, causing dropped calls in Call queue.
My proposal is to make the datanode send a block report as soon as the datanode starts. Then, it waits for a random time between 0 to 1 hour (the configured value) before sending the nect block report. From then on, block reports from that datanode are sent once every 1 hour (the configured value).