Description
Found this while testing Ozone HA with Ratis built from current master:
scm2_1 | 2022-05-08 09:36:23,915 [grpc-default-executor-0] INFO impl.SnapshotInstallationHandler: dd69b21b-bf75-4092-bd68-19f526eb2628@group-9D638F7E0D95: receive installSnapshot: 517d4b01-482b-4891-b0de-80441d4eb637->dd69b21b-bf75-4092-bd68-19f526eb2628#0-t2,notify:(t:1, i:0) scm2_1 | 2022-05-08 09:36:23,923 [grpc-default-executor-0] INFO server.RaftServer$Division: dd69b21b-bf75-4092-bd68-19f526eb2628@group-9D638F7E0D95: change Leader from null to 517d4b01-482b-4891-b0de-80441d4eb637 at term 2 for installSnapshot, leader elected after 2809ms scm2_1 | 2022-05-08 09:36:23,927 [grpc-default-executor-0] INFO impl.SnapshotInstallationHandler: dd69b21b-bf75-4092-bd68-19f526eb2628@group-9D638F7E0D95: Received notification to install snapshot at index 0 scm2_1 | 2022-05-08 09:36:23,958 [grpc-default-executor-0] INFO impl.SnapshotInstallationHandler: dd69b21b-bf75-4092-bd68-19f526eb2628@group-9D638F7E0D95: notifyInstallSnapshot: nextIndex is 0 but the leader's first available index is 0. scm2_1 | 2022-05-08 09:36:23,960 [grpc-default-executor-0] INFO ha.SCMStateMachine: Received install snapshot notification from SCM leader: scm1:9894 with term index: (t:1, i:0) scm2_1 | 2022-05-08 09:36:23,961 [pool-16-thread-1] INFO ha.SCMHAManagerImpl: Downloading checkpoint from leader SCM scm1 and reloading state from the checkpoint. scm2_1 | 2022-05-08 09:36:23,965 [grpc-default-executor-0] ERROR impl.SnapshotInstallationHandler: dd69b21b-bf75-4092-bd68-19f526eb2628@group-9D638F7E0D95: installSnapshot failed scm2_1 | java.lang.IllegalStateException: inProgressInstallSnapshotRequest: 0 is not eligible, firstAvailableLogIndex: 0 scm2_1 | at org.apache.ratis.util.Preconditions.assertTrue(Preconditions.java:60) scm2_1 | at org.apache.ratis.server.impl.SnapshotInstallationHandler.notifyStateMachineToInstallSnapshot(SnapshotInstallationHandler.java:287) scm2_1 | at org.apache.ratis.server.impl.SnapshotInstallationHandler.installSnapshotImpl(SnapshotInstallationHandler.java:115) scm2_1 | at org.apache.ratis.server.impl.SnapshotInstallationHandler.installSnapshot(SnapshotInstallationHandler.java:84) scm2_1 | at org.apache.ratis.server.impl.RaftServerImpl.installSnapshot(RaftServerImpl.java:1427)
The log above is from docker-compose cluster, but it is easily reproducible using MiniCluster-based integration test, too:
mvn clean test -DfailIfNoTests=false -Dtest=TestStorageContainerManagerHA
Attachments
Issue Links
- is related to
-
RATIS-1481 notifyStateMachineToInstallSnapshot stuck in IN_PROGRESS
- Resolved
- links to