Details
-
Sub-task
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
None
-
None
Description
TestSCMHAManagerImpl#testAddSCM fails on ratis master with the following errors.
org.apache.ratis.protocol.exceptions.ReconfigurationTimeoutException: 1c87d01a-bf32-4066-b42b-826ac82dfd8f@group-16EE0E876D68-ConfigurationStagingState: Fail to set configuration peers:[follower|localhost:9898, 1c87d01a-bf32-4066-b42b-826ac82dfd8f|localhost:9894]|listeners:[] due to NOPROGRESS 1546 at org.apache.ratis.server.impl.LeaderStateImpl$ConfigurationStagingState.fail(LeaderStateImpl.java:1262) 1547 at org.apache.ratis.server.impl.LeaderStateImpl.checkStaging(LeaderStateImpl.java:842) 1548 at org.apache.ratis.server.impl.LeaderStateImpl.access$700(LeaderStateImpl.java:101) 1549 at org.apache.ratis.server.impl.LeaderStateImpl$EventProcessor.run(LeaderStateImpl.java:774) 1550
Snapshot installation failed because of IllegalStateException.
2024-05-28 19:54:26,832 [grpc-default-executor-0] ERROR impl.SnapshotInstallationHandler (SnapshotInstallationHandler.java:installSnapshot(99)) - 388e333f-7af0-4453-8dd3-61733cda944d@group-16EE0E876D68: installSnapshot failed java.lang.IllegalStateException at com.google.common.base.Preconditions.checkState(Preconditions.java:496) at org.apache.hadoop.hdds.scm.ha.SCMStateMachine.notifyInstallSnapshotFromLeader(SCMStateMachine.java:239) at org.apache.ratis.server.impl.SnapshotInstallationHandler.notifyStateMachineToInstallSnapshot(SnapshotInstallationHandler.java:267) at org.apache.ratis.server.impl.SnapshotInstallationHandler.installSnapshotImpl(SnapshotInstallationHandler.java:128) at org.apache.ratis.server.impl.SnapshotInstallationHandler.installSnapshot(SnapshotInstallationHandler.java:97) at org.apache.ratis.server.impl.RaftServerImpl.installSnapshot(RaftServerImpl.java:1652) at org.apache.ratis.server.impl.RaftServerProxy.installSnapshot(RaftServerProxy.java:675) at org.apache.ratis.grpc.server.GrpcServerProtocolService$2.process(GrpcServerProtocolService.java:349) at org.apache.ratis.grpc.server.GrpcServerProtocolService$2.process(GrpcServerProtocolService.java:346) at org.apache.ratis.grpc.server.GrpcServerProtocolService$ServerRequestStreamObserver.process(GrpcServerProtocolService.java:106) at org.apache.ratis.grpc.server.GrpcServerProtocolService$ServerRequestStreamObserver.onNext(GrpcServerProtocolService.java:174) at org.apache.ratis.thirdparty.io.grpc.stub.ServerCalls$StreamingServerCallHandler$StreamingServerCallListener.onMessage(ServerCalls.java:262) at org.apache.ratis.thirdparty.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailableInternal(ServerCallImpl.java:329) at org.apache.ratis.thirdparty.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailable(ServerCallImpl.java:314) at org.apache.ratis.thirdparty.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1MessagesAvailable.runInContext(ServerImpl.java:833) at org.apache.ratis.thirdparty.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37) at org.apache.ratis.thirdparty.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) at java.base/java.lang.Thread.run(Thread.java:840)
Then the following log messages repeat forever.
2024-05-28 15:25:57,037 [grpc-default-executor-0] INFO server.GrpcLogAppender (GrpcLogAppender.java:onNext(674)) - b6dabbe9-8da8-4143-a83f-7f5b0eb07845@group-A959FA14CAD1->follower-InstallSnapshotResponseHandler: InstallSnapshot in progress. 2024-05-28 15:25:57,037 [51eedc82-c249-4ac2-bdcd-05a70b0f674c-server-thread1] INFO server.RaftServer$Division (RaftServerImpl.java:checkInconsistentAppendEntries(1621)) - 51eedc82-c249-4ac2-bdcd-05a70b0f674c@group-A959FA14CAD1: Failed appendEntries as snapshot (0) installation is in progress 2024-05-28 15:25:57,037 [51eedc82-c249-4ac2-bdcd-05a70b0f674c-server-thread1] INFO server.RaftServer$Division (RaftServerImpl.java:appendEntriesAsync(1575)) - 51eedc82-c249-4ac2-bdcd-05a70b0f674c@group-A959FA14CAD1: appendEntries* reply b6dabbe9-8da8-4143-a83f-7f5b0eb07845<-51eedc82-c249-4ac2-bdcd-05a70b0f674c#127986:FAIL-t2,INCONSISTENCY,nextIndex=0,followerCommit=-1,matchIndex=-1 2024-05-28 15:25:57,037 [grpc-default-executor-0] WARN server.GrpcLogAppender (GrpcLogAppender.java:onNextImpl(528)) - b6dabbe9-8da8-4143-a83f-7f5b0eb07845@group-A959FA14CAD1->follower-AppendLogResponseHandler: received INCONSISTENCY reply with nextIndex 0, errorCount=1, request=AppendEntriesRequest:cid=127986,entriesCount=0 2024-05-28 15:25:57,037 [b6dabbe9-8da8-4143-a83f-7f5b0eb07845@group-A959FA14CAD1->follower-GrpcLogAppender-LogAppenderDaemon] INFO server.GrpcLogAppender (GrpcLogAppender.java:notifyInstallSnapshot(799)) - b6dabbe9-8da8-4143-a83f-7f5b0eb07845@group-A959FA14CAD1->follower-GrpcLogAppender: notifyInstallSnapshot with firstAvailable=(t:1, i:0), followerNextIndex=0 2024-05-28 15:25:57,037 [b6dabbe9-8da8-4143-a83f-7f5b0eb07845@group-A959FA14CAD1->follower-GrpcLogAppender-LogAppenderDaemon] INFO server.GrpcLogAppender (GrpcLogAppender.java:notifyInstallSnapshot(807)) - b6dabbe9-8da8-4143-a83f-7f5b0eb07845@group-A959FA14CAD1->follower-GrpcLogAppender: send b6dabbe9-8da8-4143-a83f-7f5b0eb07845->follower#0-t2,notify:(t:1, i:0) 2024-05-28 15:25:57,037 [grpc-default-executor-0] INFO impl.SnapshotInstallationHandler (SnapshotInstallationHandler.java:installSnapshot(92)) - 51eedc82-c249-4ac2-bdcd-05a70b0f674c@group-A959FA14CAD1: receive installSnapshot: b6dabbe9-8da8-4143-a83f-7f5b0eb07845->follower#0-t2,notify:(t:1, i:0) 2024-05-28 15:25:57,037 [grpc-default-executor-0] INFO impl.SnapshotInstallationHandler (SnapshotInstallationHandler.java:installSnapshot(103)) - 51eedc82-c249-4ac2-bdcd-05a70b0f674c@group-A959FA14CAD1: reply installSnapshot: b6dabbe9-8da8-4143-a83f-7f5b0eb07845<-51eedc82-c249-4ac2-bdcd-05a70b0f674c#0:FAIL-t2,IN_PROGRESS,snapshotIndex=0 2024-05-28 15:25:57,037 [grpc-default-executor-0] INFO server.GrpcServerProtocolService (GrpcServerProtocolService.java:onCompleted(200)) - 51eedc82-c249-4ac2-bdcd-05a70b0f674c: Completed INSTALL_SNAPSHOT, lastRequest: b6dabbe9-8da8-4143-a83f-7f5b0eb07845->follower#0-t2,notify:(t:1, i:0) 2024-05-28 15:25:57,037 [grpc-default-executor-0] INFO server.GrpcServerProtocolService (GrpcServerProtocolService.java:lambda$onCompleted$7(202)) - 51eedc82-c249-4ac2-bdcd-05a70b0f674c: Completed INSTALL_SNAPSHOT, lastReply: null 2024-05-28 15:25:57,037 [grpc-default-executor-5] INFO server.GrpcLogAppender (GrpcLogAppender.java:onNext(658)) - b6dabbe9-8da8-4143-a83f-7f5b0eb07845@group-A959FA14CAD1->follower-InstallSnapshotResponseHandler: received a reply b6dabbe9-8da8-4143-a83f-7f5b0eb07845<-51eedc82-c249-4ac2-bdcd-05a70b0f674c#0:FAIL-t2,IN_PROGRESS,snapshotIndex=0 2024-05-28 15:25:57,037 [grpc-default-executor-5] INFO server.GrpcLogAppender (GrpcLogAppender.java:onNext(674)) - b6dabbe9-8da8-4143-a83f-7f5b0eb07845@group-A959FA14CAD1->follower-InstallSnapshotResponseHandler: InstallSnapshot in progress. 2024-05-28 15:25:57,037 [51eedc82-c249-4ac2-bdcd-05a70b0f674c-server-thread1] INFO server.RaftServer$Division (RaftServerImpl.java:checkInconsistentAppendEntries(1621)) - 51eedc82-c249-4ac2-bdcd-05a70b0f674c@group-A959FA14CAD1: Failed appendEntries as snapshot (0) installation is in progress 2024-05-28 15:25:57,037 [51eedc82-c249-4ac2-bdcd-05a70b0f674c-server-thread1] INFO server.RaftServer$Division (RaftServerImpl.java:appendEntriesAsync(1575)) - 51eedc82-c249-4ac2-bdcd-05a70b0f674c@group-A959FA14CAD1: appendEntries* reply b6dabbe9-8da8-4143-a83f-7f5b0eb07845<-51eedc82-c249-4ac2-bdcd-05a70b0f674c#127987:FAIL-t2,INCONSISTENCY,nextIndex=0,followerCommit=-1,matchIndex=-1 2024-05-28 15:25:57,038 [grpc-default-executor-5] WARN server.GrpcLogAppender (GrpcLogAppender.java:onNextImpl(528)) - b6dabbe9-8da8-4143-a83f-7f5b0eb07845@group-A959FA14CAD1->follower-AppendLogResponseHandler: received INCONSISTENCY reply with nextIndex 0, errorCount=1, request=AppendEntriesRequest:cid=127987,entriesCount=0 2024-05-28 15:25:57,038 [b6dabbe9-8da8-4143-a83f-7f5b0eb07845@group-A959FA14CAD1->follower-GrpcLogAppender-LogAppenderDaemon] INFO server.GrpcLogAppender (GrpcLogAppender.java:notifyInstallSnapshot(799)) - b6dabbe9-8da8-4143-a83f-7f5b0eb07845@group-A959FA14CAD1->follower-GrpcLogAppender: notifyInstallSnapshot with firstAvailable=(t:1, i:0), followerNextIndex=0 2024-05-28 15:25:57,038 [b6dabbe9-8da8-4143-a83f-7f5b0eb07845@group-A959FA14CAD1->follower-GrpcLogAppender-LogAppenderDaemon] INFO server.GrpcLogAppender (GrpcLogAppender.java:notifyInstallSnapshot(807)) - b6dabbe9-8da8-4143-a83f-7f5b0eb07845@group-A959FA14CAD1->follower-GrpcLogAppender: send b6dabbe9-8da8-4143-a83f-7f5b0eb07845->follower#0-t2,notify:(t:1, i:0) 2024-05-28 15:25:57,038 [grpc-default-executor-5] INFO impl.SnapshotInstallationHandler (SnapshotInstallationHandler.java:installSnapshot(92)) - 51eedc82-c249-4ac2-bdcd-05a70b0f674c@group-A959FA14CAD1: receive installSnapshot: b6dabbe9-8da8-4143-a83f-7f5b0eb07845->follower#0-t2,notify:(t:1, i:0) 2024-05-28 15:25:57,038 [grpc-default-executor-5] INFO impl.SnapshotInstallationHandler (SnapshotInstallationHandler.java:installSnapshot(103)) - 51eedc82-c249-4ac2-bdcd-05a70b0f674c@group-A959FA14CAD1: reply installSnapshot: b6dabbe9-8da8-4143-a83f-7f5b0eb07845<-51eedc82-c249-4ac2-bdcd-05a70b0f674c#0:FAIL-t2,IN_PROGRESS,snapshotIndex=0 2024-05-28 15:25:57,038 [grpc-default-executor-5] INFO server.GrpcServerProtocolService (GrpcServerProtocolService.java:onCompleted(200)) - 51eedc82-c249-4ac2-bdcd-05a70b0f674c: Completed INSTALL_SNAPSHOT, lastRequest: b6dabbe9-8da8-4143-a83f-7f5b0eb07845->follower#0-t2,notify:(t:1, i:0) 2024-05-28 15:25:57,038 [grpc-default-executor-5] INFO server.GrpcServerProtocolService (GrpcServerProtocolService.java:lambda$onCompleted$7(202)) - 51eedc82-c249-4ac2-bdcd-05a70b0f674c: Completed INSTALL_SNAPSHOT, lastReply: null 2024-05-28 15:25:57,038 [grpc-default-executor-0] INFO server.GrpcLogAppender (GrpcLogAppender.java:onNext(658)) - b6dabbe9-8da8-4143-a83f-7f5b0eb07845@group-A959FA14CAD1->follower-InstallSnapshotResponseHandler: received a reply b6dabbe9-8da8-4143-a83f-7f5b0eb07845<-51eedc82-c249-4ac2-bdcd-05a70b0f674c#0:FAIL-t2,IN_PROGRESS,snapshotIndex=0
Attachments
Issue Links
- relates to
-
HDDS-10696 Ozone integration test fails because of empty snapshot installation.
- Resolved
- links to