  1. Apache Ozone
  2. HDDS-1451

SCMBlockManager findPipeline and createPipeline are not lock protected

Details

    • Done

    Description

      SCM BlockManager may try to create pipelines when none are needed. This happens because BlockManagerImpl#allocateBlock is not lock protected, so concurrent calls can each find no usable pipeline and each attempt to create one. One of these pipeline creations can then fail even though an open pipeline already exists, as the following log shows.

      2019-04-22 22:34:14,336 INFO  pipeline.RatisPipelineProvider (RatisPipelineProvider.java:lambda$create$1(103)) -  pipeline Pipeline[ Id: 6f4bb2d7-d660-4f9f-bc06-72b10f9a738e, Nodes: 76e1a493-fd55-4d67-9f55-c04fd6bd3a33{ip: 192.168.0.104, host: 192.168.0.104, certSerialId: null}2b9850b2-aed3-4a40-91b5-2447dc5246bf{ip: 192.168.0.104, host: 192.168.0.104, certSerialId: null}12248721-ea6a-453f-8dad-fc7fbe692fd2{ip: 192.168.0.104, host: 192.168.0.104, certSerialId: null}, Type:RATIS, Factor:THREE, State:OPEN]
      2019-04-22 22:34:14,386 INFO  impl.RoleInfo (RoleInfo.java:shutdownLeaderElection(134)) - e17b7852-4691-40c7-8791-ad0b0da5201f: shutdown LeaderElection
      2019-04-22 22:34:14,388 INFO  pipeline.RatisPipelineProvider (RatisPipelineProvider.java:lambda$create$1(103)) -  pipeline Pipeline[ Id: 552e28f3-98d9-41f3-86e0-c1b9494838a5, Nodes: e17b7852-4691-40c7-8791-ad0b0da5201f{ip: 192.168.0.104, host: 192.168.0.104, certSerialId: null}fd365bac-e26e-4b11-afd8-9d08cd1b0521{ip: 192.168.0.104, host: 192.168.0.104, certSerialId: null}9583a007-7f02-4074-9e26-19bc18e29ec5{ip: 192.168.0.104, host: 192.168.0.104, certSerialId: null}, Type:RATIS, Factor:THREE, State:OPEN]
      2019-04-22 22:34:14,388 INFO  impl.RoleInfo (RoleInfo.java:updateAndGet(143)) - e17b7852-4691-40c7-8791-ad0b0da5201f: start FollowerState
      2019-04-22 22:34:14,388 INFO  pipeline.RatisPipelineProvider (RatisPipelineProvider.java:lambda$create$1(103)) -  pipeline Pipeline[ Id: 5383151b-d625-4362-a7dd-c0d353acaf76, Nodes: 80f16ad6-3879-4a64-a3c7-7719813cc139{ip: 192.168.0.104, host: 192.168.0.104, certSerialId: null}082ce481-7fb0-4f88-ac21-82609290a6a2{ip: 192.168.0.104, host: 192.168.0.104, certSerialId: null}dd5f5a70-0217-4577-b7a2-c42aa139d18a{ip: 192.168.0.104, host: 192.168.0.104, certSerialId: null}, Type:RATIS, Factor:THREE, State:OPEN]
      2019-04-22 22:34:14,389 INFO  pipeline.RatisPipelineProvider (RatisPipelineProvider.java:lambda$create$1(103)) -  pipeline Pipeline[ Id: be4854e5-7933-4caa-b32e-f482cf500247, Nodes: 6e2356f1-479d-498b-876a-1c90623c498b{ip: 192.168.0.104, host: 192.168.0.104, certSerialId: null}8ac46d94-9975-4eea-9448-2618c69d7bf3{ip: 192.168.0.104, host: 192.168.0.104, certSerialId: null}a3ed36a1-44ca-47b2-b9b3-5aeef0459518{ip: 192.168.0.104, host: 192.168.0.104, certSerialId: null}, Type:RATIS, Factor:THREE, State:OPEN]
      2019-04-22 22:34:14,390 INFO  pipeline.RatisPipelineProvider (RatisPipelineProvider.java:lambda$create$1(103)) -  pipeline Pipeline[ Id: 21e368e2-f82a-4c61-9cc3-06e8de22ea6b, Nodes: 82632040-5754-4122-b187-331879586842{ip: 192.168.0.104, host: 192.168.0.104, certSerialId: null}923c8537-b869-4085-adcb-0a9accdcd089{ip: 192.168.0.104, host: 192.168.0.104, certSerialId: null}c6d790bf-e3a6-4064-acb5-f74796cd38a9{ip: 192.168.0.104, host: 192.168.0.104, certSerialId: null}, Type:RATIS, Factor:THREE, State:OPEN]
      2019-04-22 22:34:14,390 INFO  pipeline.RatisPipelineProvider (RatisPipelineProvider.java:lambda$create$1(103)) -  pipeline Pipeline[ Id: cccbc2ed-e0e2-4578-a8a2-94f4b645be52, Nodes: 91ae6848-a778-43be-a4a1-5855f7adc0d8{ip: 192.168.0.104, host: 192.168.0.104, certSerialId: null}8f330a03-40e2-4bd1-9b43-5e05b13d89f0{ip: 192.168.0.104, host: 192.168.0.104, certSerialId: null}4f3070dc-650b-48d7-87b5-d2076104e7b4{ip: 192.168.0.104, host: 192.168.0.104, certSerialId: null}, Type:RATIS, Factor:THREE, State:OPEN]
      2019-04-22 22:34:14,392 ERROR block.BlockManagerImpl (BlockManagerImpl.java:allocateBlock(192)) - Pipeline creation failed for type:RATIS factor:THREE
      org.apache.hadoop.hdds.scm.pipeline.InsufficientDatanodesException: Cannot create pipeline of factor 3 using 2 nodes 20 healthy nodes 20 all nodes.
              at org.apache.hadoop.hdds.scm.pipeline.RatisPipelineProvider.create(RatisPipelineProvider.java:122)
              at org.apache.hadoop.hdds.scm.pipeline.PipelineFactory.create(PipelineFactory.java:57)
              at org.apache.hadoop.hdds.scm.pipeline.SCMPipelineManager.createPipeline(SCMPipelineManager.java:148)
              at org.apache.hadoop.hdds.scm.block.BlockManagerImpl.allocateBlock(BlockManagerImpl.java:190)
              at org.apache.hadoop.hdds.scm.server.SCMBlockProtocolServer.allocateBlock(SCMBlockProtocolServer.java:172)
              at org.apache.hadoop.ozone.protocolPB.ScmBlockLocationProtocolServerSideTranslatorPB.allocateScmBlock(ScmBlockLocationProtocolServerSideTranslatorPB.java:82)
              at org.apache.hadoop.hdds.protocol.proto.ScmBlockLocationProtocolProtos$ScmBlockLocationProtocolService$2.callBlockingMethod(ScmBlockLocationProtocolProtos.java:7533)
              at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
              at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
              at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
              at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
              at java.security.AccessController.doPrivileged(Native Method)
              at javax.security.auth.Subject.doAs(Subject.java:422)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
              at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
      2019-04-22 22:34:14,395 ERROR block.BlockManagerImpl (BlockManagerImpl.java:allocateBlock(213)) - Unable to allocate a block for the size: 16384, type: RATIS, factor: THREE
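
One possible way to close this race is to make the "find an open pipeline, otherwise create one" step atomic. The sketch below is only an illustration under stated assumptions, not the patch committed for this issue: PipelineAllocatorSketch, findOrCreatePipeline, pipelineCount and the free-node counter are hypothetical stand-ins for BlockManagerImpl and SCMPipelineManager, and pipelines are modelled as plain strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical stand-in for the allocation path; not the real Ozone classes. */
class PipelineAllocatorSketch {
  private static final int FACTOR = 3;               // datanodes per pipeline (Factor:THREE)

  private final ReentrantLock lock = new ReentrantLock();
  private final List<String> openPipelines = new ArrayList<>();
  private int freeNodes;                              // assumed pool of unused datanodes

  PipelineAllocatorSketch(int freeNodes) {
    this.freeNodes = freeNodes;
  }

  /** Returns an existing open pipeline, creating one only if none exists. */
  String findOrCreatePipeline() {
    lock.lock();                                      // find and create happen as one step
    try {
      if (!openPipelines.isEmpty()) {
        return openPipelines.get(0);                  // reuse instead of creating another
      }
      if (freeNodes < FACTOR) {
        throw new IllegalStateException(
            "Cannot create pipeline of factor " + FACTOR + " using " + freeNodes + " nodes");
      }
      freeNodes -= FACTOR;
      String id = "pipeline-" + (openPipelines.size() + 1);
      openPipelines.add(id);
      return id;
    } finally {
      lock.unlock();
    }
  }

  int pipelineCount() {
    lock.lock();
    try {
      return openPipelines.size();
    } finally {
      lock.unlock();
    }
  }

  public static void main(String[] args) throws Exception {
    // Five free nodes: enough for one pipeline, not for two.
    PipelineAllocatorSketch allocator = new PipelineAllocatorSketch(5);
    ExecutorService pool = Executors.newFixedThreadPool(8);
    Callable<String> task = allocator::findOrCreatePipeline;
    List<Future<String>> results = new ArrayList<>();
    for (int i = 0; i < 8; i++) {
      results.add(pool.submit(task));
    }
    for (Future<String> result : results) {
      System.out.println("allocated from " + result.get());
    }
    pool.shutdown();
    System.out.println("pipelines created: " + allocator.pipelineCount());
  }
}
```

With the lock in place, only the first caller creates a pipeline and the remaining callers reuse it, so no caller runs into the insufficient-nodes failure while an open pipeline exists. Without the lock, two callers could both see an empty list, and the second creation would fail for lack of nodes, which mirrors the InsufficientDatanodesException in the log above. Holding a single lock across pipeline creation serializes all allocations, so a real fix might prefer a narrower critical section.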
      

            People

              avijayan Aravindan Vijayan
              msingh Mukul Kumar Singh

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 1h 50m