Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- 3.2.2, 3.3.1, 3.2.3
- None
Description
When I enable ECN and run Terasort (500 GB of data, 8 DataNodes, 76 vcores per DataNode) as a test, the DataNodes become congested (HDFS-8008). After receiving ACKs many times, the client goes to sleep without releasing the lock on 'dataQueue'. The ResponseProcessor thread needs the 'dataQueue' lock to execute 'ackQueue.getFirst()', so it waits for the client to release 'dataQueue'. In effect the ResponseProcessor thread sleeps as well, which delays the ACKs. MapReduce tasks can be delayed by tens of minutes or even hours.
As a fix, the DataStreamer thread can first execute 'one = dataQueue.getFirst()', release 'dataQueue', and then decide whether to call 'backOffIfNecessary()' based on 'one.isHeartbeatPacket()'.
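The proposed ordering can be sketched roughly as follows. This is a simplified, hypothetical stand-in and not the HDFS source: the class BackoffOutsideLockSketch, its Packet type, the 'congested' flag, the fixed 'backoffMillis' sleep and the 'lockHeld' probe are all assumptions made for illustration. Only the names 'dataQueue', 'ackQueue', 'backOffIfNecessary()' and 'isHeartbeatPacket()' come from the description above.

```java
import java.util.LinkedList;
import java.util.concurrent.TimeUnit;

/** Hypothetical, simplified model of the streamer loop; not the real DataStreamer. */
class BackoffOutsideLockSketch {
    /** Minimal packet stand-in; only the heartbeat flag matters here. */
    static final class Packet {
        private final boolean heartbeat;
        Packet(boolean heartbeat) { this.heartbeat = heartbeat; }
        boolean isHeartbeatPacket() { return heartbeat; }
    }

    private final LinkedList<Packet> dataQueue = new LinkedList<>();
    private final LinkedList<Packet> ackQueue = new LinkedList<>();
    private final long backoffMillis;        // assumed fixed backoff, for illustration
    private volatile boolean congested;      // assumed congestion signal
    private volatile boolean lockHeld;       // records whether backoff ran under the lock

    BackoffOutsideLockSketch(long backoffMillis) { this.backoffMillis = backoffMillis; }

    void setCongested(boolean congested) { this.congested = congested; }
    boolean lockHeldDuringBackoff() { return lockHeld; }

    /** Writer side: append a packet at the tail of dataQueue. */
    void enqueue(Packet p) {
        synchronized (dataQueue) { dataQueue.addLast(p); dataQueue.notifyAll(); }
    }

    /** Responder side: needs the dataQueue lock to look at the head of ackQueue. */
    Packet ackHead() {
        synchronized (dataQueue) { return ackQueue.isEmpty() ? null : ackQueue.getFirst(); }
    }

    /** Streamer side: choose the packet under the lock, back off after releasing it. */
    Packet nextPacketToSend() {
        Packet one;
        synchronized (dataQueue) {
            one = dataQueue.isEmpty() ? new Packet(true) : dataQueue.getFirst();
        } // lock released here, before any sleep
        if (!one.isHeartbeatPacket()) {
            backOffIfNecessary();
            synchronized (dataQueue) {
                // In this sketch other threads only append at the tail,
                // so the head is still the packet chosen above.
                dataQueue.removeFirst();
                ackQueue.addLast(one);
                dataQueue.notifyAll();
            }
        }
        return one;
    }

    /** Sleeps when congested; runs without holding the dataQueue lock. */
    private void backOffIfNecessary() {
        if (!congested) {
            return;
        }
        lockHeld = Thread.holdsLock(dataQueue);
        try {
            TimeUnit.MILLISECONDS.sleep(backoffMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
        }
    }

    public static void main(String[] args) {
        BackoffOutsideLockSketch s = new BackoffOutsideLockSketch(50);
        s.setCongested(true);
        s.enqueue(new Packet(false));
        Packet sent = s.nextPacketToSend();
        System.out.println("heartbeat=" + sent.isHeartbeatPacket()
            + " lockHeldDuringBackoff=" + s.lockHeldDuringBackoff());
    }
}
```

Because the sleep happens outside the synchronized block, a thread calling ackHead() in this sketch can take the 'dataQueue' lock while the streamer is backing off, which is the behaviour the description asks for.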
Attachments
Issue Links
- is caused by
- HDFS-8008 Support client-side back off when the datanodes are congested
- Resolved