Hadoop HDFS / HDFS-7060

Avoid taking locks when sending heartbeats from the DataNode

Details

    • Reviewed

    Description

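      We're seeing the heartbeat thread blocked on the FsDatasetImpl monitor when the DataNode (DN) is under heavy write load. In the thread dumps below, a writer holds the monitor inside createRbw while creating a file, and both the heartbeat thread and another writer wait for it: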

         java.lang.Thread.State: BLOCKED (on object monitor)
              at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getDfsUsed(FsVolumeImpl.java:115)
              - waiting to lock <0x0000000780304fb8> (a org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
              at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getStorageReports(FsDatasetImpl.java:91)
              - locked <0x0000000780612fd8> (a java.lang.Object)
              at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:563)
              at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:668)
              at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:827)
              at java.lang.Thread.run(Thread.java:744)
      
         java.lang.Thread.State: BLOCKED (on object monitor)
              at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:743)
              - waiting to lock <0x0000000780304fb8> (a org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
              at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:60)
              at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:169)
              at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:621)
              at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:124)
              at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:71)
              at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:232)
              at java.lang.Thread.run(Thread.java:744)
      
         java.lang.Thread.State: RUNNABLE
              at java.io.UnixFileSystem.createFileExclusively(Native Method)
              at java.io.File.createNewFile(File.java:1006)
              at org.apache.hadoop.hdfs.server.datanode.DatanodeUtil.createTmpFile(DatanodeUtil.java:59)
              at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.createRbwFile(BlockPoolSlice.java:244)
              at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.createRbwFile(FsVolumeImpl.java:195)
              at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:753)
              - locked <0x0000000780304fb8> (a org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
              at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:60)
              at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:169)
              at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:621)
              at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:124)
              at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:71)
              at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:232)
              at java.lang.Thread.run(Thread.java:744)
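
The traces above show the heartbeat path (getStorageReports -> getDfsUsed) waiting on the same FsDatasetImpl monitor that createRbw holds during file creation. One general way to avoid that kind of contention is to keep usage counters in atomics so that readers never need the monitor. The following is a minimal sketch under that assumption, not the change made in the attached patches: LockFreeUsageSketch, Volume, Dataset, createReplica and storageReport are hypothetical stand-ins, not HDFS classes or methods.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch; these are not the real HDFS classes. */
public class LockFreeUsageSketch {

    /** Stand-in for a volume that tracks how many bytes it uses. */
    public static final class Volume {
        private final AtomicLong dfsUsed = new AtomicLong();

        void addUsed(long bytes) {
            dfsUsed.addAndGet(bytes);
        }

        // Not synchronized: readers do not need the dataset monitor.
        long getDfsUsed() {
            return dfsUsed.get();
        }
    }

    /** Stand-in for the dataset whose monitor writers hold. */
    public static final class Dataset {
        final Volume volume = new Volume();

        // Writer path: holds the dataset monitor across slow work,
        // similar to createRbw creating a file in the trace.
        synchronized void createReplica(long bytes, CountDownLatch holding,
                                        CountDownLatch release)
                throws InterruptedException {
            holding.countDown(); // signal that the monitor is now held
            release.await();     // simulate slow file creation
            volume.addUsed(bytes);
        }

        // Heartbeat path: reads the counter without taking the monitor.
        long storageReport() {
            return volume.getDfsUsed();
        }
    }

    /** Returns true if the report is readable while a writer holds the monitor. */
    public static boolean heartbeatCompletesWhileWriterHoldsLock() {
        Dataset ds = new Dataset();
        ds.volume.addUsed(42);
        CountDownLatch holding = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Thread writer = new Thread(() -> {
            try {
                ds.createReplica(8, holding, release);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        writer.start();
        try {
            holding.await();                    // writer owns the monitor
            long reported = ds.storageReport(); // returns without blocking
            release.countDown();                // let the writer finish
            writer.join();
            return reported == 42 && ds.storageReport() == 50;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(heartbeatCompletesWhileWriterHoldsLock());
    }
}
```

If storageReport were declared synchronized, the call made while the writer is parked would block until release.countDown() ran, which could never happen in this test, mirroring the BLOCKED heartbeat thread in the dump. The trade-off is that a lock-free read may observe a value that is slightly stale relative to an in-flight write.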
      

      Attachments

        1. complete_failed_qps.png
          42 kB
          Jiandan Yang
        2. HDFS-7060.000.patch
          4 kB
          Haohui Mai
        3. HDFS-7060.001.patch
          5 kB
          Xinwei Qin
        4. HDFS-7060.003.patch
          6 kB
          Jiandan Yang
        5. HDFS-7060.004.patch
          6 kB
          Weiwei Yang
        6. HDFS-7060.005.patch
          6 kB
          Weiwei Yang
        7. HDFS-7060-002.patch
          4 kB
          Brahma Reddy Battula
        8. HDFS Status Post Patch.png
          72 kB
          Weiwei Yang
        9. sendHeartbeat.png
          87 kB
          Jiandan Yang

          People

            yangjiandan Jiandan Yang
            wheat9 Haohui Mai
            Votes: 0
            Watchers: 38
