Uploaded image for project: 'Apache Ozone'
  1. Apache Ozone
  2. HDDS-10927 Upgrade ratis to 3.1.0 and adopt zero-copy
  3. HDDS-10924

TestSCMHAManagerImpl#testAddSCM fails on ratis master

Log workAgile BoardRank to TopRank to BottomAttach filesAttach ScreenshotBulk Copy AttachmentsBulk Move AttachmentsVotersWatch issueWatchersConvert to IssueLinkCloneLabelsUpdate Comment AuthorReplace String in CommentUpdate Comment VisibilityDelete Comments
    XMLWordPrintableJSON

Details

    Description

      TestSCMHAManagerImpl#testAddSCM fails on ratis master with the following errors.

      org.apache.ratis.protocol.exceptions.ReconfigurationTimeoutException: 1c87d01a-bf32-4066-b42b-826ac82dfd8f@group-16EE0E876D68-ConfigurationStagingState: Fail to set configuration peers:[follower|localhost:9898, 1c87d01a-bf32-4066-b42b-826ac82dfd8f|localhost:9894]|listeners:[] due to NOPROGRESS
      1546	at org.apache.ratis.server.impl.LeaderStateImpl$ConfigurationStagingState.fail(LeaderStateImpl.java:1262)
      1547	at org.apache.ratis.server.impl.LeaderStateImpl.checkStaging(LeaderStateImpl.java:842)
      1548	at org.apache.ratis.server.impl.LeaderStateImpl.access$700(LeaderStateImpl.java:101)
      1549	at org.apache.ratis.server.impl.LeaderStateImpl$EventProcessor.run(LeaderStateImpl.java:774)
      1550 

      Snapshot installation failed because of IllegalStateException.

      2024-05-28 19:54:26,832 [grpc-default-executor-0] ERROR impl.SnapshotInstallationHandler (SnapshotInstallationHandler.java:installSnapshot(99)) - 388e333f-7af0-4453-8dd3-61733cda944d@group-16EE0E876D68: installSnapshot failed
      java.lang.IllegalStateException
          at com.google.common.base.Preconditions.checkState(Preconditions.java:496)
          at org.apache.hadoop.hdds.scm.ha.SCMStateMachine.notifyInstallSnapshotFromLeader(SCMStateMachine.java:239)
          at org.apache.ratis.server.impl.SnapshotInstallationHandler.notifyStateMachineToInstallSnapshot(SnapshotInstallationHandler.java:267)
          at org.apache.ratis.server.impl.SnapshotInstallationHandler.installSnapshotImpl(SnapshotInstallationHandler.java:128)
          at org.apache.ratis.server.impl.SnapshotInstallationHandler.installSnapshot(SnapshotInstallationHandler.java:97)
          at org.apache.ratis.server.impl.RaftServerImpl.installSnapshot(RaftServerImpl.java:1652)
          at org.apache.ratis.server.impl.RaftServerProxy.installSnapshot(RaftServerProxy.java:675)
          at org.apache.ratis.grpc.server.GrpcServerProtocolService$2.process(GrpcServerProtocolService.java:349)
          at org.apache.ratis.grpc.server.GrpcServerProtocolService$2.process(GrpcServerProtocolService.java:346)
          at org.apache.ratis.grpc.server.GrpcServerProtocolService$ServerRequestStreamObserver.process(GrpcServerProtocolService.java:106)
          at org.apache.ratis.grpc.server.GrpcServerProtocolService$ServerRequestStreamObserver.onNext(GrpcServerProtocolService.java:174)
          at org.apache.ratis.thirdparty.io.grpc.stub.ServerCalls$StreamingServerCallHandler$StreamingServerCallListener.onMessage(ServerCalls.java:262)
          at org.apache.ratis.thirdparty.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailableInternal(ServerCallImpl.java:329)
          at org.apache.ratis.thirdparty.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailable(ServerCallImpl.java:314)
          at org.apache.ratis.thirdparty.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1MessagesAvailable.runInContext(ServerImpl.java:833)
          at org.apache.ratis.thirdparty.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
          at org.apache.ratis.thirdparty.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
          at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
          at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
          at java.base/java.lang.Thread.run(Thread.java:840) 

      Then the following log messages repeat forever.

      2024-05-28 15:25:57,037 [grpc-default-executor-0] INFO  server.GrpcLogAppender (GrpcLogAppender.java:onNext(674)) - b6dabbe9-8da8-4143-a83f-7f5b0eb07845@group-A959FA14CAD1->follower-InstallSnapshotResponseHandler: InstallSnapshot in progress.
      2024-05-28 15:25:57,037 [51eedc82-c249-4ac2-bdcd-05a70b0f674c-server-thread1] INFO  server.RaftServer$Division (RaftServerImpl.java:checkInconsistentAppendEntries(1621)) - 51eedc82-c249-4ac2-bdcd-05a70b0f674c@group-A959FA14CAD1: Failed appendEntries as snapshot (0) installation is in progress
      2024-05-28 15:25:57,037 [51eedc82-c249-4ac2-bdcd-05a70b0f674c-server-thread1] INFO  server.RaftServer$Division (RaftServerImpl.java:appendEntriesAsync(1575)) - 51eedc82-c249-4ac2-bdcd-05a70b0f674c@group-A959FA14CAD1: appendEntries* reply b6dabbe9-8da8-4143-a83f-7f5b0eb07845<-51eedc82-c249-4ac2-bdcd-05a70b0f674c#127986:FAIL-t2,INCONSISTENCY,nextIndex=0,followerCommit=-1,matchIndex=-1
      2024-05-28 15:25:57,037 [grpc-default-executor-0] WARN  server.GrpcLogAppender (GrpcLogAppender.java:onNextImpl(528)) - b6dabbe9-8da8-4143-a83f-7f5b0eb07845@group-A959FA14CAD1->follower-AppendLogResponseHandler: received INCONSISTENCY reply with nextIndex 0, errorCount=1, request=AppendEntriesRequest:cid=127986,entriesCount=0
      2024-05-28 15:25:57,037 [b6dabbe9-8da8-4143-a83f-7f5b0eb07845@group-A959FA14CAD1->follower-GrpcLogAppender-LogAppenderDaemon] INFO  server.GrpcLogAppender (GrpcLogAppender.java:notifyInstallSnapshot(799)) - b6dabbe9-8da8-4143-a83f-7f5b0eb07845@group-A959FA14CAD1->follower-GrpcLogAppender: notifyInstallSnapshot with firstAvailable=(t:1, i:0), followerNextIndex=0
      2024-05-28 15:25:57,037 [b6dabbe9-8da8-4143-a83f-7f5b0eb07845@group-A959FA14CAD1->follower-GrpcLogAppender-LogAppenderDaemon] INFO  server.GrpcLogAppender (GrpcLogAppender.java:notifyInstallSnapshot(807)) - b6dabbe9-8da8-4143-a83f-7f5b0eb07845@group-A959FA14CAD1->follower-GrpcLogAppender: send b6dabbe9-8da8-4143-a83f-7f5b0eb07845->follower#0-t2,notify:(t:1, i:0)
      2024-05-28 15:25:57,037 [grpc-default-executor-0] INFO  impl.SnapshotInstallationHandler (SnapshotInstallationHandler.java:installSnapshot(92)) - 51eedc82-c249-4ac2-bdcd-05a70b0f674c@group-A959FA14CAD1: receive installSnapshot: b6dabbe9-8da8-4143-a83f-7f5b0eb07845->follower#0-t2,notify:(t:1, i:0)
      2024-05-28 15:25:57,037 [grpc-default-executor-0] INFO  impl.SnapshotInstallationHandler (SnapshotInstallationHandler.java:installSnapshot(103)) - 51eedc82-c249-4ac2-bdcd-05a70b0f674c@group-A959FA14CAD1: reply installSnapshot: b6dabbe9-8da8-4143-a83f-7f5b0eb07845<-51eedc82-c249-4ac2-bdcd-05a70b0f674c#0:FAIL-t2,IN_PROGRESS,snapshotIndex=0
      2024-05-28 15:25:57,037 [grpc-default-executor-0] INFO  server.GrpcServerProtocolService (GrpcServerProtocolService.java:onCompleted(200)) - 51eedc82-c249-4ac2-bdcd-05a70b0f674c: Completed INSTALL_SNAPSHOT, lastRequest: b6dabbe9-8da8-4143-a83f-7f5b0eb07845->follower#0-t2,notify:(t:1, i:0)
      2024-05-28 15:25:57,037 [grpc-default-executor-0] INFO  server.GrpcServerProtocolService (GrpcServerProtocolService.java:lambda$onCompleted$7(202)) - 51eedc82-c249-4ac2-bdcd-05a70b0f674c: Completed INSTALL_SNAPSHOT, lastReply: null
      2024-05-28 15:25:57,037 [grpc-default-executor-5] INFO  server.GrpcLogAppender (GrpcLogAppender.java:onNext(658)) - b6dabbe9-8da8-4143-a83f-7f5b0eb07845@group-A959FA14CAD1->follower-InstallSnapshotResponseHandler: received a reply b6dabbe9-8da8-4143-a83f-7f5b0eb07845<-51eedc82-c249-4ac2-bdcd-05a70b0f674c#0:FAIL-t2,IN_PROGRESS,snapshotIndex=0
      2024-05-28 15:25:57,037 [grpc-default-executor-5] INFO  server.GrpcLogAppender (GrpcLogAppender.java:onNext(674)) - b6dabbe9-8da8-4143-a83f-7f5b0eb07845@group-A959FA14CAD1->follower-InstallSnapshotResponseHandler: InstallSnapshot in progress.
      2024-05-28 15:25:57,037 [51eedc82-c249-4ac2-bdcd-05a70b0f674c-server-thread1] INFO  server.RaftServer$Division (RaftServerImpl.java:checkInconsistentAppendEntries(1621)) - 51eedc82-c249-4ac2-bdcd-05a70b0f674c@group-A959FA14CAD1: Failed appendEntries as snapshot (0) installation is in progress
      2024-05-28 15:25:57,037 [51eedc82-c249-4ac2-bdcd-05a70b0f674c-server-thread1] INFO  server.RaftServer$Division (RaftServerImpl.java:appendEntriesAsync(1575)) - 51eedc82-c249-4ac2-bdcd-05a70b0f674c@group-A959FA14CAD1: appendEntries* reply b6dabbe9-8da8-4143-a83f-7f5b0eb07845<-51eedc82-c249-4ac2-bdcd-05a70b0f674c#127987:FAIL-t2,INCONSISTENCY,nextIndex=0,followerCommit=-1,matchIndex=-1
      2024-05-28 15:25:57,038 [grpc-default-executor-5] WARN  server.GrpcLogAppender (GrpcLogAppender.java:onNextImpl(528)) - b6dabbe9-8da8-4143-a83f-7f5b0eb07845@group-A959FA14CAD1->follower-AppendLogResponseHandler: received INCONSISTENCY reply with nextIndex 0, errorCount=1, request=AppendEntriesRequest:cid=127987,entriesCount=0
      2024-05-28 15:25:57,038 [b6dabbe9-8da8-4143-a83f-7f5b0eb07845@group-A959FA14CAD1->follower-GrpcLogAppender-LogAppenderDaemon] INFO  server.GrpcLogAppender (GrpcLogAppender.java:notifyInstallSnapshot(799)) - b6dabbe9-8da8-4143-a83f-7f5b0eb07845@group-A959FA14CAD1->follower-GrpcLogAppender: notifyInstallSnapshot with firstAvailable=(t:1, i:0), followerNextIndex=0
      2024-05-28 15:25:57,038 [b6dabbe9-8da8-4143-a83f-7f5b0eb07845@group-A959FA14CAD1->follower-GrpcLogAppender-LogAppenderDaemon] INFO  server.GrpcLogAppender (GrpcLogAppender.java:notifyInstallSnapshot(807)) - b6dabbe9-8da8-4143-a83f-7f5b0eb07845@group-A959FA14CAD1->follower-GrpcLogAppender: send b6dabbe9-8da8-4143-a83f-7f5b0eb07845->follower#0-t2,notify:(t:1, i:0)
      2024-05-28 15:25:57,038 [grpc-default-executor-5] INFO  impl.SnapshotInstallationHandler (SnapshotInstallationHandler.java:installSnapshot(92)) - 51eedc82-c249-4ac2-bdcd-05a70b0f674c@group-A959FA14CAD1: receive installSnapshot: b6dabbe9-8da8-4143-a83f-7f5b0eb07845->follower#0-t2,notify:(t:1, i:0)
      2024-05-28 15:25:57,038 [grpc-default-executor-5] INFO  impl.SnapshotInstallationHandler (SnapshotInstallationHandler.java:installSnapshot(103)) - 51eedc82-c249-4ac2-bdcd-05a70b0f674c@group-A959FA14CAD1: reply installSnapshot: b6dabbe9-8da8-4143-a83f-7f5b0eb07845<-51eedc82-c249-4ac2-bdcd-05a70b0f674c#0:FAIL-t2,IN_PROGRESS,snapshotIndex=0
      2024-05-28 15:25:57,038 [grpc-default-executor-5] INFO  server.GrpcServerProtocolService (GrpcServerProtocolService.java:onCompleted(200)) - 51eedc82-c249-4ac2-bdcd-05a70b0f674c: Completed INSTALL_SNAPSHOT, lastRequest: b6dabbe9-8da8-4143-a83f-7f5b0eb07845->follower#0-t2,notify:(t:1, i:0)
      2024-05-28 15:25:57,038 [grpc-default-executor-5] INFO  server.GrpcServerProtocolService (GrpcServerProtocolService.java:lambda$onCompleted$7(202)) - 51eedc82-c249-4ac2-bdcd-05a70b0f674c: Completed INSTALL_SNAPSHOT, lastReply: null
      2024-05-28 15:25:57,038 [grpc-default-executor-0] INFO  server.GrpcLogAppender (GrpcLogAppender.java:onNext(658)) - b6dabbe9-8da8-4143-a83f-7f5b0eb07845@group-A959FA14CAD1->follower-InstallSnapshotResponseHandler: received a reply b6dabbe9-8da8-4143-a83f-7f5b0eb07845<-51eedc82-c249-4ac2-bdcd-05a70b0f674c#0:FAIL-t2,IN_PROGRESS,snapshotIndex=0 

      Attachments

        Issue Links

        Activity

          This comment will be Viewable by All Users Viewable by All Users
          Cancel

          People

            adoroszlai Attila Doroszlai Assign to me
            duongnguyen Duong
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Slack

                Issue deployment