Uploaded image for project: 'Hadoop HDFS'
  1. Hadoop HDFS
  2. HDFS-7240 Scaling HDFS
  3. HDFS-12691

Ozone: Decrease interval time of SCMBlockDeletingService for improving the efficiency

    XMLWordPrintableJSON

Details

    • Sub-task
    • Status: Resolved
    • Major
    • Resolution: Later
    • HDFS-7240
    • None
    • ozone, performance
    • None

    Description

      After logging elapsed time of each block deletion task, found some places we can make some improvements after testing. The logging during testing:

      2017-10-20 17:02:55,168 [Datanode State Machine Thread - 0] INFO  Configuration.deprecation (Configuration.java:logDeprecation(1306)) - No unit for ozone.scm.heartbeat.interval.seconds(1) assuming SECONDS
      2017-10-20 17:02:56,169 [Datanode State Machine Thread - 0] INFO  Configuration.deprecation (Configuration.java:logDeprecation(1306)) - No unit for ozone.scm.heartbeat.interval.seconds(1) assuming SECONDS
      2017-10-20 17:02:56,451 [SCMBlockDeletingService#0] INFO  utils.BackgroundService (BackgroundService.java:run(99)) - Running background service : SCMBlockDeletingService
      2017-10-20 17:02:56,755 [KeyDeletingService#0] INFO  utils.BackgroundService (BackgroundService.java:run(99)) - Running background service : KeyDeletingService
      2017-10-20 17:02:56,758 [KeyDeletingService#1] INFO  ksm.KeyDeletingService (KeyDeletingService.java:call(99))      - Found 11 to-delete keys in KSM
      2017-10-20 17:02:56,775 [IPC Server handler 19 on 52342] INFO  scm.StorageContainerManager (StorageContainerManager.java:deleteKeyBlocks(870))      - SCM is informed by KSM to delete 11 blocks
      2017-10-20 17:02:57,182 [Datanode State Machine Thread - 0] INFO  Configuration.deprecation (Configuration.java:logDeprecation(1306)) - No unit for ozone.scm.heartbeat.interval.seconds(1) assuming SECONDS
      2017-10-20 17:02:57,640 [KeyDeletingService#1] INFO  ksm.KeyDeletingService (KeyDeletingService.java:call(125))      - Number of key deleted from KSM DB: 11, task elapsed time: 885ms
      2017-10-20 17:02:58,168 [Datanode State Machine Thread - 0] INFO  Configuration.deprecation (Configuration.java:logDeprecation(1306)) - No unit for ozone.scm.heartbeat.interval.seconds(1) assuming SECONDS
      2017-10-20 17:03:03,178 [Datanode State Machine Thread - 0] INFO  Configuration.deprecation (Configuration.java:logDeprecation(1306)) - No unit for ozone.scm.heartbeat.interval.seconds(1) assuming SECONDS
      ...
      2017-10-20 17:03:04,167 [Datanode State Machine Thread - 0] INFO  Configuration.deprecation (Configuration.java:logDeprecation(1306)) - No unit for ozone.scm.heartbeat.interval.seconds(1) assuming SECONDS
      2017-10-20 17:03:05,173 [Datanode State Machine Thread - 0] INFO  Configuration.deprecation (Configuration.java:logDeprecation(1306)) - No unit for ozone.scm.heartbeat.interval.seconds(1) assuming SECONDS
      2017-10-20 17:03:06,095 [BlockDeletingService#0] INFO  utils.BackgroundService (BackgroundService.java:run(99)) - Running background service : BlockDeletingService
      2017-10-20 17:03:06,095 [BlockDeletingService#0] INFO  background.BlockDeletingService (BlockDeletingService.java:getTasks(109))      - Plan to choose 10 containers for block deletion, actually returns 0 valid containers.
      2017-10-20 17:03:06,171 [Datanode State Machine Thread - 0] INFO  Configuration.deprecation (Configuration.java:logDeprecation(1306)) - No unit for ozone.scm.heartbeat.interval.seconds(1) assuming SECONDS
      ...
      2017-10-20 17:03:54,279 [Datanode State Machine Thread - 0] INFO  Configuration.deprecation (Configuration.java:logDeprecation(1306)) - No unit for ozone.scm.heartbeat.interval.seconds(1) assuming SECONDS
      2017-10-20 17:03:55,267 [Datanode State Machine Thread - 0] INFO  Configuration.deprecation (Configuration.java:logDeprecation(1306)) - No unit for ozone.scm.heartbeat.interval.seconds(1) assuming SECONDS
      2017-10-20 17:03:56,282 [Datanode State Machine Thread - 0] INFO  Configuration.deprecation (Configuration.java:logDeprecation(1306)) - No unit for ozone.scm.heartbeat.interval.seconds(1) assuming SECONDS
      2017-10-20 17:03:56,461 [SCMBlockDeletingService#0] INFO  utils.BackgroundService (BackgroundService.java:run(99)) - Running background service : SCMBlockDeletingService
      2017-10-20 17:03:56,467 [SCMBlockDeletingService#1] INFO  block.SCMBlockDeletingService (SCMBlockDeletingService.java:call(129))      - Totally added 11 delete blocks command for 1 datanodes, task elapsed time: 6ms
      2017-10-20 17:03:57,265 [Datanode State Machine Thread - 0] INFO  Configuration.deprecation (Configuration.java:logDeprecation(1306)) - No unit for ozone.scm.heartbeat.interval.seconds(1) assuming SECONDS
      2017-10-20 17:03:57,645 [KeyDeletingService#0] INFO  utils.BackgroundService (BackgroundService.java:run(99)) - Running background service : KeyDeletingService
      2017-10-20 17:03:58,278 [Datanode State Machine Thread - 0] INFO  Configuration.deprecation (Configuration.java:logDeprecation(1306)) - No unit for ozone.scm.heartbeat.interval.seconds(1) assuming SECONDS
      2017-10-20 17:03:58,522 [Command processor thread] INFO  commandhandler.DeleteBlocksCommandHandler (DeleteBlocksCommandHandler.java:handle(87))      - Start to delete container blocks, TXIDs=[1(0),2(0),3(0),4(0),5(0),6(0),7(0),8(0),9(0),10(0),11(0)], numOfContainers=1, numOfBlocks=11
      2017-10-20 17:03:58,851 [Command processor thread] INFO  commandhandler.DeleteBlocksCommandHandler (DeleteBlocksCommandHandler.java:deleteContainerBlocks(168))      - Transited Block 29352ed9-cee2-4e05-beee-50284ea300c2 to DELETING state in container 1bfa7356-b07c-4d25-97b2-e66eaff19584
      2017-10-20 17:03:58,879 [Command processor thread] INFO  commandhandler.DeleteBlocksCommandHandler (DeleteBlocksCommandHandler.java:deleteContainerBlocks(168))      - Transited Block 88903455-756f-4640-8deb-069a5b2266fe to DELETING state in container 1bfa7356-b07c-4d25-97b2-e66eaff19584
      2017-10-20 17:03:58,906 [Command processor thread] INFO  commandhandler.DeleteBlocksCommandHandler (DeleteBlocksCommandHandler.java:deleteContainerBlocks(168))      - Transited Block d2ec1e39-f768-4432-88a9-acc2c5d7a7f2 to DELETING state in container 1bfa7356-b07c-4d25-97b2-e66eaff19584
      2017-10-20 17:03:58,934 [Command processor thread] INFO  commandhandler.DeleteBlocksCommandHandler (DeleteBlocksCommandHandler.java:deleteContainerBlocks(168))      - Transited Block 2785467a-f05c-49a1-ac9a-fc2819f73eba to DELETING state in container 1bfa7356-b07c-4d25-97b2-e66eaff19584
      2017-10-20 17:03:58,962 [Command processor thread] INFO  commandhandler.DeleteBlocksCommandHandler (DeleteBlocksCommandHandler.java:deleteContainerBlocks(168))      - Transited Block c4b39df9-d350-4a02-92eb-8ab3719e70c7 to DELETING state in container 1bfa7356-b07c-4d25-97b2-e66eaff19584
      2017-10-20 17:03:58,990 [Command processor thread] INFO  commandhandler.DeleteBlocksCommandHandler (DeleteBlocksCommandHandler.java:deleteContainerBlocks(168))      - Transited Block 0e85a4d3-2c6a-4822-9b9e-947451a8f645 to DELETING state in container 1bfa7356-b07c-4d25-97b2-e66eaff19584
      2017-10-20 17:03:59,017 [Command processor thread] INFO  commandhandler.DeleteBlocksCommandHandler (DeleteBlocksCommandHandler.java:deleteContainerBlocks(168))      - Transited Block b022c3bb-059c-47ac-83ca-49ae281765fb to DELETING state in container 1bfa7356-b07c-4d25-97b2-e66eaff19584
      2017-10-20 17:03:59,045 [Command processor thread] INFO  commandhandler.DeleteBlocksCommandHandler (DeleteBlocksCommandHandler.java:deleteContainerBlocks(168))      - Transited Block 4b76df3d-8b23-46f4-bbc1-98847e7c82bd to DELETING state in container 1bfa7356-b07c-4d25-97b2-e66eaff19584
      2017-10-20 17:03:59,073 [Command processor thread] INFO  commandhandler.DeleteBlocksCommandHandler (DeleteBlocksCommandHandler.java:deleteContainerBlocks(168))      - Transited Block 6474e09e-4626-4352-89e5-2e11e6063203 to DELETING state in container 1bfa7356-b07c-4d25-97b2-e66eaff19584
      2017-10-20 17:03:59,101 [Command processor thread] INFO  commandhandler.DeleteBlocksCommandHandler (DeleteBlocksCommandHandler.java:deleteContainerBlocks(168))      - Transited Block 4ec42696-5388-4869-958a-714b81e8e5fb to DELETING state in container 1bfa7356-b07c-4d25-97b2-e66eaff19584
      2017-10-20 17:03:59,235 [Command processor thread] INFO  commandhandler.DeleteBlocksCommandHandler (DeleteBlocksCommandHandler.java:deleteContainerBlocks(168))      - Transited Block 2fdb437c-dc51-4d43-8dfb-9aff0c319712 to DELETING state in container 1bfa7356-b07c-4d25-97b2-e66eaff19584
      2017-10-20 17:03:59,236 [IPC Server handler 3 on 52328] INFO  scm.StorageContainerManager (StorageContainerManager.java:sendContainerBlocksDeletionACK(779))      - Purging TXID=1 from block deletion log
      2017-10-20 17:03:59,236 [IPC Server handler 3 on 52328] INFO  scm.StorageContainerManager (StorageContainerManager.java:sendContainerBlocksDeletionACK(779))      - Purging TXID=2 from block deletion log
      2017-10-20 17:03:59,237 [IPC Server handler 3 on 52328] INFO  scm.StorageContainerManager (StorageContainerManager.java:sendContainerBlocksDeletionACK(779))      - Purging TXID=3 from block deletion log
      2017-10-20 17:03:59,237 [IPC Server handler 3 on 52328] INFO  scm.StorageContainerManager (StorageContainerManager.java:sendContainerBlocksDeletionACK(779))      - Purging TXID=4 from block deletion log
      2017-10-20 17:03:59,237 [IPC Server handler 3 on 52328] INFO  scm.StorageContainerManager (StorageContainerManager.java:sendContainerBlocksDeletionACK(779))      - Purging TXID=5 from block deletion log
      2017-10-20 17:03:59,237 [IPC Server handler 3 on 52328] INFO  scm.StorageContainerManager (StorageContainerManager.java:sendContainerBlocksDeletionACK(779))      - Purging TXID=6 from block deletion log
      2017-10-20 17:03:59,237 [IPC Server handler 3 on 52328] INFO  scm.StorageContainerManager (StorageContainerManager.java:sendContainerBlocksDeletionACK(779))      - Purging TXID=7 from block deletion log
      2017-10-20 17:03:59,237 [IPC Server handler 3 on 52328] INFO  scm.StorageContainerManager (StorageContainerManager.java:sendContainerBlocksDeletionACK(779))      - Purging TXID=8 from block deletion log
      2017-10-20 17:03:59,238 [IPC Server handler 3 on 52328] INFO  scm.StorageContainerManager (StorageContainerManager.java:sendContainerBlocksDeletionACK(779))      - Purging TXID=9 from block deletion log
      2017-10-20 17:03:59,238 [IPC Server handler 3 on 52328] INFO  scm.StorageContainerManager (StorageContainerManager.java:sendContainerBlocksDeletionACK(779))      - Purging TXID=10 from block deletion log
      2017-10-20 17:03:59,238 [IPC Server handler 3 on 52328] INFO  scm.StorageContainerManager (StorageContainerManager.java:sendContainerBlocksDeletionACK(779))      - Purging TXID=11 from block deletion log
      2017-10-20 17:03:59,265 [Datanode State Machine Thread - 0] INFO  Configuration.deprecation (Configuration.java:logDeprecation(1306)) - No unit for ozone.scm.heartbeat.interval.seconds(1) assuming SECONDS
      2017-10-20 17:04:00,279 [Datanode State Machine Thread - 0] INFO  Configuration.deprecation (Configuration.java:logDeprecation(1306)) - No unit for ozone.scm.heartbeat.interval.seconds(1) assuming SECONDS
      2017-10-20 17:04:01,266 [Datanode State Machine Thread - 0] INFO  Configuration.deprecation (Configuration.java:logDeprecation(1306)) - No unit for ozone.scm.heartbeat.interval.seconds(1) assuming SECONDS
      2017-10-20 17:04:02,266 [Datanode State Machine Thread - 0] INFO  Configuration.deprecation (Configuration.java:logDeprecation(1306)) - No unit for ozone.scm.heartbeat.interval.seconds(1) assuming SECONDS
      2017-10-20 17:04:03,268 [Datanode State Machine Thread - 0] INFO  Configuration.deprecation (Configuration.java:logDeprecation(1306)) - No unit for ozone.scm.heartbeat.interval.seconds(1) assuming SECONDS
      2017-10-20 17:04:04,278 [Datanode State Machine Thread - 0] INFO  Configuration.deprecation (Configuration.java:logDeprecation(1306)) - No unit for ozone.scm.heartbeat.interval.seconds(1) assuming SECONDS
      2017-10-20 17:04:05,275 [Datanode State Machine Thread - 0] INFO  Configuration.deprecation (Configuration.java:logDeprecation(1306)) - No unit for ozone.scm.heartbeat.interval.seconds(1) assuming SECONDS
      2017-10-20 17:04:06,099 [BlockDeletingService#1] INFO  utils.BackgroundService (BackgroundService.java:run(99)) - Running background service : BlockDeletingService
      2017-10-20 17:04:06,099 [BlockDeletingService#1] INFO  background.BlockDeletingService (BlockDeletingService.java:getTasks(109))      - Plan to choose 10 containers for block deletion, actually returns 1 valid containers.
      2017-10-20 17:04:06,185 [BlockDeletingService#0] INFO  background.BlockDeletingService (BlockDeletingService.java:call(224))      - Container: 1bfa7356-b07c-4d25-97b2-e66eaff19584, deleted blocks: 11, task elapsed time: 84ms
      2017-10-20 17:04:06,265 [Datanode State Machine Thread - 0] INFO  Configuration.deprecation (Configuration.java:logDeprecation(1306)) - No unit for ozone.scm.heartbeat.interval.seconds(1) assuming SECONDS
      2017-10-20 17:04:07,271 [Datanode State Machine Thread - 0] INFO  Configuration.deprecation (Configuration.java:logDeprecation(1306)) - No unit for ozone.scm.heartbeat.interval.seconds(1) assuming SECONDS
      2017-10-20 17:04:08,269 [Datanode State Machine Thread - 0] INFO  Configuration.deprecation (Configuration.java:logDeprecation(1306)) - No unit for ozone.scm.heartbeat.interval.seconds(1) assuming SECONDS
      

      Now three deletion tasks reuse the same interval time for executing periodically, but actually this will lead some problem. From the elapsed time showed in log, service KeyDeletingService will cost the most time, then BlockDeletingService and SCMBlockDeletingService costs least time.
      We should make SCMBlockDeletingService executed frequently than other two service, then the service BlockDeletingService can work more effectively since it relies on SCMBlockDeletingService's completion.

      Attachments

        Activity

          People

            linyiqun Yiqun Lin
            linyiqun Yiqun Lin
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: