Apache Ozone / HDDS-5725

Catch possible AlreadyExistsException for create pipeline command.


Details

    Description

      Suspicious error logs appear on the datanode side during pipeline creation:

      2021-09-07 06:50:31,920 [Command processor thread] ERROR org.apache.hadoop.ozone.container.common.statemachine.commandhandler.CreatePipelineCommandHandler: Can't create pipeline RATIS THREE PipelineID=6c7e4b36-1f8d-41c1-b304-e8629e383fb3
      java.io.IOException: 686266d4-d95c-46a5-acb0-cf3b8569e527: Failed to add group-E8629E383FB3:[b5f6ebd0-b0dd-4dd2-b320-b8ac57fad57f|rpc:17.16.10.60:9856|admin:17.16.10.60:9857|client:17.16.10.60:19858|priority:1, 686266d4-d95c-46a5-acb0-cf3b8569e527|rpc:17.16.10.66:9856|admin:17.16.10.66:9857|client:17.16.10.66:19858|priority:0, 8b819046-c478-47ec-a37a-ea7053e9a5f8|rpc:17.16.10.57:9856|admin:17.16.10.57:9857|client:17.16.10.57:19858|priority:0] since the group already exists in the map.
              at org.apache.hadoop.ozone.container.common.transport.server.ratis.XceiverServerRatis.addGroup(XceiverServerRatis.java:756)
              at org.apache.hadoop.ozone.container.common.statemachine.commandhandler.CreatePipelineCommandHandler.handle(CreatePipelineCommandHandler.java:92)
              at org.apache.hadoop.ozone.container.common.statemachine.commandhandler.CommandDispatcher.handle(CommandDispatcher.java:99)
              at org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.lambda$initCommandHandlerThread$2(DatanodeStateMachine.java:555)
              at java.lang.Thread.run(Thread.java:748)
      Caused by: org.apache.ratis.protocol.exceptions.AlreadyExistsException: 686266d4-d95c-46a5-acb0-cf3b8569e527: Failed to add group-E8629E383FB3:[b5f6ebd0-b0dd-4dd2-b320-b8ac57fad57f|rpc:17.16.10.60:9856|admin:17.16.10.60:9857|client:17.16.10.60:19858|priority:1, 686266d4-d95c-46a5-acb0-cf3b8569e527|rpc:17.16.10.66:9856|admin:17.16.10.66:9857|client:17.16.10.66:19858|priority:0, 8b819046-c478-47ec-a37a-ea7053e9a5f8|rpc:17.16.10.57:9856|admin:17.16.10.57:9857|client:17.16.10.57:19858|priority:0] since the group already exists in the map.
              at org.apache.ratis.server.impl.RaftServerProxy$ImplMap.addNew(RaftServerProxy.java:89)
              at org.apache.ratis.server.impl.RaftServerProxy.groupAddAsync(RaftServerProxy.java:472)
              at org.apache.ratis.server.impl.RaftServerProxy.groupManagementAsync(RaftServerProxy.java:456)
              at org.apache.ratis.server.impl.RaftServerProxy.groupManagement(RaftServerProxy.java:440)
              at org.apache.hadoop.ozone.container.common.transport.server.ratis.XceiverServerRatis.addGroup(XceiverServerRatis.java:754)
              ... 4 more
      

      In https://issues.apache.org/jira/browse/HDDS-2679 we added `addGroup` rpc calls from a dn to its peers, sent when the dn receives a pipeline creation command. As a result, a dn may process either its own creation command or an `addGroup` rpc from a peer first. Whichever is processed second finds that the group already exists.

      We already catch the AlreadyExistsException in the rpc reply, and we should also catch it in the creation command handling routine.
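
A minimal sketch of the proposed handling, not the actual patch. `GroupRegistry`, `Outcome`, and `handleCreatePipeline` are hypothetical names, and the nested `AlreadyExistsException` is only a stand-in for the Ratis class so that the example compiles without Ratis on the classpath. It assumes, as the stack trace above suggests, that the exception reaches the handler as the cause of an `IOException`.

```java
import java.io.IOException;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class PipelineCreateSketch {

    /** Stand-in for org.apache.ratis.protocol.exceptions.AlreadyExistsException. */
    static class AlreadyExistsException extends IOException {
        AlreadyExistsException(String message) {
            super(message);
        }
    }

    /** Hypothetical stand-in for the group map held by the Ratis server. */
    static class GroupRegistry {
        private final Set<String> groups = ConcurrentHashMap.newKeySet();

        /** Mirrors the shape seen in the log: an IOException caused by AlreadyExistsException. */
        void addGroup(String groupId) throws IOException {
            if (!groups.add(groupId)) {
                AlreadyExistsException cause = new AlreadyExistsException(
                    "Failed to add " + groupId + " since the group already exists in the map.");
                throw new IOException(cause.getMessage(), cause);
            }
        }
    }

    enum Outcome { CREATED, ALREADY_EXISTS }

    /** Treats "group already exists" as a benign outcome; rethrows every other failure. */
    static Outcome handleCreatePipeline(GroupRegistry registry, String groupId) throws IOException {
        try {
            registry.addGroup(groupId);
            return Outcome.CREATED;
        } catch (IOException e) {
            if (e instanceof AlreadyExistsException
                || e.getCause() instanceof AlreadyExistsException) {
                // A peer's addGroup rpc won the race, so the group is already in place.
                return Outcome.ALREADY_EXISTS;
            }
            throw e;
        }
    }

    public static void main(String[] args) throws IOException {
        GroupRegistry registry = new GroupRegistry();
        String groupId = "group-E8629E383FB3";
        System.out.println(handleCreatePipeline(registry, groupId)); // first arrival
        System.out.println(handleCreatePipeline(registry, groupId)); // second arrival
    }
}
```

In a real handler the `ALREADY_EXISTS` branch would presumably log at a lower level than ERROR, since the pipeline is usable either way.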

       

            People

              markgui Mark Gui