HDFS-16293: Client sleeps and holds 'dataQueue' when DataNodes are congested

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.2.2, 3.3.1, 3.2.3
    • Fix Version/s: 3.4.0, 3.3.2
    • Component/s: hdfs-client
    • Labels: None

    Description

      When I enabled ECN and ran Terasort (500 GB of data, 8 DataNodes, 76 vcores per DataNode) as a test, the DataNodes became congested (HDFS-8008). After repeatedly receiving ACKs, the client enters the sleep state but does not release 'dataQueue'. The ResponseProcessor thread needs 'dataQueue' to execute 'ackQueue.getFirst()', so it has to wait until the client releases 'dataQueue'. In effect, the ResponseProcessor thread sleeps as well, which delays ACK processing. MapReduce tasks can be delayed by tens of minutes or even hours.
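      The contention described above can be reproduced in isolation. The following is a minimal sketch, not the Hadoop code: the class `HeldLockDemo`, its method name, and the string "packets" are hypothetical stand-ins, and the 10 ms sleep only imitates the congestion back-off. It shows that a thread which needs the 'dataQueue' monitor sits in the BLOCKED state for as long as the other thread sleeps inside the synchronized block.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-alone model; it does not use any Hadoop classes.
final class HeldLockDemo {

    /** Returns true if the responder was seen BLOCKED while the streamer slept. */
    static boolean responderBlockedDuringBackoff() {
        final Deque<String> dataQueue = new ArrayDeque<>();
        final Deque<String> ackQueue = new ArrayDeque<>();
        dataQueue.addLast("packet-1");
        ackQueue.addLast("packet-0");

        // Simplified ResponseProcessor: needs the dataQueue monitor to read ackQueue.
        Thread responder = new Thread(() -> {
            synchronized (dataQueue) {
                ackQueue.getFirst();
            }
        });

        boolean sawBlocked = false;
        try {
            // Simplified DataStreamer (the calling thread): sleeps while holding the monitor.
            synchronized (dataQueue) {
                responder.start();
                long deadline = System.nanoTime() + 5_000_000_000L; // give up after 5 s
                while (!sawBlocked && System.nanoTime() < deadline) {
                    Thread.sleep(10); // stands in for the back-off sleep
                    sawBlocked = responder.getState() == Thread.State.BLOCKED;
                }
                dataQueue.getFirst();
            }
            responder.join(); // the responder proceeds only after the monitor is released
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted", e);
        }
        return sawBlocked;
    }

    public static void main(String[] args) {
        System.out.println("responder blocked during back-off: "
            + responderBlockedDuringBackoff());
    }
}
```

      In the real client the back-off can last far longer than 10 ms, so every ACK that arrives during that window waits for the full sleep.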

      Instead, the DataStreamer thread can first execute 'one = dataQueue.getFirst()', release 'dataQueue', and then decide whether to call 'backOffIfNecessary()' based on 'one.isHeartbeatPacket()'.
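      The ordering proposed above might look roughly like the sketch below. This is an illustration under assumptions, not the attached patch: `BackoffOutsideLockSketch`, its `Packet` class, and the bookkeeping fields are hypothetical, and the back-off body is left empty where a real client would sleep. Only the names 'dataQueue', 'backOffIfNecessary()', and 'isHeartbeatPacket()' come from the description. `Thread.holdsLock` is used to record whether the back-off ran while the monitor was held.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical model of the send loop; it does not use any Hadoop classes.
final class BackoffOutsideLockSketch {

    /** Hypothetical stand-in for a packet; only the heartbeat flag is modeled. */
    static final class Packet {
        private final boolean heartbeat;
        Packet(boolean heartbeat) { this.heartbeat = heartbeat; }
        boolean isHeartbeatPacket() { return heartbeat; }
    }

    private final Deque<Packet> dataQueue = new ArrayDeque<>();
    private boolean backedOff = false;
    private boolean heldLockDuringBackoff = false;

    void enqueue(Packet p) {
        synchronized (dataQueue) {
            dataQueue.addLast(p);
        }
    }

    private void backOffIfNecessary() {
        backedOff = true;
        // Record whether the caller still owns the dataQueue monitor.
        heldLockDuringBackoff = Thread.holdsLock(dataQueue);
        // A real client would sleep here when DataNodes report congestion.
    }

    /** One iteration: pick the packet under the lock, back off after releasing it. */
    Packet nextPacket() {
        Packet one;
        synchronized (dataQueue) {
            // An empty queue yields a heartbeat packet in this model.
            one = dataQueue.isEmpty() ? new Packet(true) : dataQueue.getFirst();
        }
        // The monitor is released here, so other threads can take it during the back-off.
        if (!one.isHeartbeatPacket()) {
            backOffIfNecessary();
        }
        return one;
    }

    boolean backedOff() { return backedOff; }
    boolean heldLockDuringBackoff() { return heldLockDuringBackoff; }

    public static void main(String[] args) {
        BackoffOutsideLockSketch s = new BackoffOutsideLockSketch();
        s.enqueue(new Packet(false));
        s.nextPacket();
        System.out.println("held lock during back-off: " + s.heldLockDuringBackoff());
    }
}
```

      Because the packet is chosen before the back-off, the heartbeat check needs no second lock acquisition, and the ResponseProcessor can read 'ackQueue' while the streamer sleeps.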

       

      Attachments

        1. HDFS-16293.01.patch
          1 kB
          Yuanxin Zhu
        2. HDFS-16293.01-branch-3.2.2.patch
          1 kB
          Yuanxin Zhu
        3. HDFS-16293.02.patch
          4 kB
          Yuanxin Zhu
        4. HDFS-16293.03.patch
          4 kB
          Yuanxin Zhu
        5. HDFS-16293.04.patch
          5 kB
          Yuanxin Zhu
        6. HDFS-16293.05.patch
          5 kB
          Yuanxin Zhu
        7. HDFS-16293.06.patch
          5 kB
          Yuanxin Zhu
        8. HDFS-16293.07.patch
          6 kB
          Yuanxin Zhu

            People

              Assignee: Yuanxin Zhu
              Reporter: Yuanxin Zhu
              Votes: 0
              Watchers: 5

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated: 24h
                  Remaining: 24h
                  Logged: Not Specified