Uploaded image for project: 'Ratis'
  1. Ratis
  2. RATIS-1369

When there is no snapshot, the return snapshotIndex should be -1, not 0.

Attach filesAttach ScreenshotVotersWatch issueWatchersCreate sub-taskLinkCloneUpdate Comment AuthorReplace String in CommentUpdate Comment VisibilityDelete Comments
    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • None
    • 2.1.0
    • server
    • None

    Description

      For now, if there is no snapshot existed, the returned snapshotIndex is 0.

      It is not correctly, since the raft log starts from index 0 (BTW, in raft paper, the first valid log index is 1, but in ratis, it is 0), a snapshot taken at snapshotIndex 0 should contain the log entry 0. The snapshotIndex indicating an empty/fake snapshot should be -1.

       

      The fix is as follows: 

      diff --git a/ratis-server/src/main/java/org/apache/ratis/server/impl/ServerState.java b/ratis-server/src/main/java/org/apache/ratis/server/impl/ServerState.java
      index 728f7e9c..6307ca79 100644
      --- a/ratis-server/src/main/java/org/apache/ratis/server/impl/ServerState.java
      +++ b/ratis-server/src/main/java/org/apache/ratis/server/impl/ServerState.java
      @@ -431,7 +431,7 @@ class ServerState implements Closeable {
       
         long getLatestInstalledSnapshotIndex() {
           final TermIndex ti = latestInstalledSnapshot.get();
      -    return ti != null? ti.getIndex(): 0L;
      +    return ti != null? ti.getIndex(): -1L;
         }
       
         /**
      @@ -440,7 +440,7 @@ class ServerState implements Closeable {
          */
         long getSnapshotIndex() {
           final SnapshotInfo s = getLatestSnapshot();
      -    final long latestSnapshotIndex = s != null ? s.getIndex() : 0;
      +    final long latestSnapshotIndex = s != null ? s.getIndex() : -1;
           return Math.max(latestSnapshotIndex, getLatestInstalledSnapshotIndex());
         }
      

       

      This issue is found in the test of bootstrap SCM in SCM HA. 

       

      TL; DR

      The issue can be reproduced by follow steps:

      1, use follow configuration for SCM HA

        <!-- scm snapshot -->
        <property>
          <name>ozone.scm.ha.ratis.snapshot.threshold</name>
          <value>1</value>
        </property>
        <property>
          <name>ozone.scm.ha.ratis.log.purge.gap</name>
          <value>5</value>
        </property>
        <property>
          <name>ozone.scm.ha.ratis.log.purge.enabled</name>
          <value>true</value>
        </property>
      

      2, int and start SCM1

      3, start DN1, DN2, DN3

      4, wait for a while, until there is dozens of raft log entries.

      5, bootstrap SCM2

      6, start SCM2, SCM1 will be stuck in appendEntries Timeout.

       

       The RCA is:

      after init SCM1, the raft log is 

      [root@9 ~/glengeng/ozone-1.2.0-SNAPSHOT]# ls /tmp/metadata/scm-ha/b7a11f50-85b8-4b93-8a6b-c00705b4e3d4/current/
      log_inprogress_0  raft-meta  raft-meta.conf
      [root@9 ~/glengeng/ozone-1.2.0-SNAPSHOT]# 
      

       

      after SCM1 is started, and there is dozens of log entries generated, the raft log is

      [root@9 ~/glengeng/ozone-1.2.0-SNAPSHOT]# ls /tmp/metadata/scm-ha/b7a11f50-85b8-4b93-8a6b-c00705b4e3d4/current/
      log_inprogress_1  raft-meta  raft-meta.conf
      [root@9 ~/glengeng/ozone-1.2.0-SNAPSHOT]# 
      

      Since purge is enable and purge gap is 5, the file log_0_0 is removed. The leader does not have entry 0 any more.

       

      Then SCM2 is bootstrapped and started, we expect SCM2 to download a snapshot from leader, since leader miss log entry 0, and SCM2 is empty.

      The follow SCM do receive an installSnapshot request,

      021-04-25 19:25:55,933 [grpc-default-executor-0] INFO org.apache.ratis.server.RaftServer$Division: 55560a65-e7f7-46f0-b463-f511187fd358@group-7587E602AEC2: receive installSnapshot: 630b3fbd-861f-4676-8e16-8ae548ed6658->55560a65-e7f7-46f0-b463-f511187fd358#0-t2,notify:(t:2, i:1)
      

      But due to the default snapshot index is 0, 

      if (inProgressInstallSnapshotRequest.compareAndSet(null, firstAvailableLogTermIndex)) {
      
        // Check if snapshot index is already at par or ahead of the first
        // available log index of the Leader.
        long snapshotIndex = state.getSnapshotIndex();
        if (snapshotIndex + 1 >= firstAvailableLogIndex) {
          // State Machine has already installed the snapshot. Return the
          // latest snapshot index to the Leader.
      
          inProgressInstallSnapshotRequest.compareAndSet(firstAvailableLogTermIndex, null);
          final InstallSnapshotReplyProto reply = ServerProtoUtils.toInstallSnapshotReplyProto(
              leaderId, getMemberId(), currentTerm, InstallSnapshotResult.ALREADY_INSTALLED, snapshotIndex);
          LOG.info("{}: StateMachine snapshotIndex is {}", getMemberId(), snapshotIndex);
          return reply;
        }
      

      0(snapshotIndex) + 1 == 1(firstAvailableLogIndex), thus follower SCM ignore the  downloading.

      2021-04-25 19:25:55,933 [grpc-default-executor-0] INFO org.apache.ratis.server.RaftServer$Division: 55560a65-e7f7-46f0-b463-f511187fd358@group-7587E602AEC2: receive installSnapshot: 630b3fbd-861f-4676-8e16-8ae548ed6658->55560a65-e7f7-46f0-b463-f511187fd358#0-t2,notify:(t:2, i:1)
      2021-04-25 19:25:55,940 [grpc-default-executor-0] INFO org.apache.hadoop.hdds.scm.ha.SCMStateMachine: leader changed, yet current SCM is still follower.
      2021-04-25 19:25:55,940 [grpc-default-executor-0] INFO org.apache.ratis.server.RaftServer$Division: 55560a65-e7f7-46f0-b463-f511187fd358@group-7587E602AEC2: change Leader from null to 630b3fbd-861f-4676-8e16-8ae548ed6658 at term 2 for installSnapshot, leader elected after 1416ms
      2021-04-25 19:25:55,954 [grpc-default-executor-0] INFO org.apache.ratis.server.RaftServer$Division: 55560a65-e7f7-46f0-b463-f511187fd358@group-7587E602AEC2: StateMachine snapshotIndex is 0
      2021-04-25 19:25:56,019 [grpc-default-executor-0] INFO org.apache.ratis.server.RaftServer$Division: 55560a65-e7f7-46f0-b463-f511187fd358@group-7587E602AEC2: set new configuration index: 1
      configurationEntry {
        peers {
          id: "630b3fbd-861f-4676-8e16-8ae548ed6658"
          address: "9.134.50.210:9865"
        }
      }
       from snapshot
      2021-04-25 19:25:56,022 [grpc-default-executor-0] DEBUG org.apache.ratis.util.LifeCycle: 630b3fbd-861f-4676-8e16-8ae548ed6658|rpc:9.134.50.210:9865|admin:|client:|dataStream:|priority:0: NEW
      2021-04-25 19:25:56,022 [grpc-default-executor-0] INFO org.apache.ratis.server.RaftServer$Division: 55560a65-e7f7-46f0-b463-f511187fd358@group-7587E602AEC2: set configuration 1: [630b3fbd-861f-4676-8e16-8ae548ed6658|rpc:9.134.50.210:9865|admin:|client:|dataStream:|priority:0], old=null
      2021-04-25 19:25:56,026 [grpc-default-executor-0] INFO org.apache.ratis.server.RaftServer$Division: 55560a65-e7f7-46f0-b463-f511187fd358@group-7587E602AEC2: reply installSnapshot: 630b3fbd-861f-4676-8e16-8ae548ed6658<-55560a65-e7f7-46f0-b463-f511187fd358#0:FAIL-t0,ALREADY_INSTALLED
      2021-04-25 19:25:56,026 [grpc-default-executor-0] DEBUG org.apache.ratis.grpc.server.GrpcServerProtocolService: 55560a65-e7f7-46f0-b463-f511187fd358: reply 630b3fbd-861f-4676-8e16-8ae548ed6658<-55560a65-e7f7-46f0-b463-f511187fd358#0:FAIL-t0,ALREADY_INSTALLED
      

       

      In the leader side, the matchIndex is updated to 0, and the nextIndex is updated to 1.

      @Override
      public void setSnapshotIndex(long newSnapshotIndex) {
        snapshotIndex.setUnconditionally(newSnapshotIndex, infoIndexChange);
        matchIndex.setUnconditionally(newSnapshotIndex, infoIndexChange);
        nextIndex.setUnconditionally(newSnapshotIndex + 1, infoIndexChange);
      }
      
      2021-04-25 19:23:58,738 [grpc-default-executor-0] INFO org.apache.ratis.grpc.server.GrpcLogAppender: 630b3fbd-861f-4676-8e16-8ae548ed6658@group-7587E602AEC2->55560a65-e7f7-46f0-b463-f511187fd358-InstallSnapshotResponseHandler: received the first reply 630b3fbd-861f-4676-8e16-8ae548ed6658<-55560a65-e7f7-46f0-b463-f511187fd358#0:FAIL-t0,ALREADY_INSTALLED
      2021-04-25 19:23:58,740 [grpc-default-executor-0] INFO org.apache.ratis.grpc.server.GrpcLogAppender: 630b3fbd-861f-4676-8e16-8ae548ed6658@group-7587E602AEC2->55560a65-e7f7-46f0-b463-f511187fd358-InstallSnapshotResponseHandler: Already Installed Snapshot Index 0.
      2021-04-25 19:23:58,741 [grpc-default-executor-0] INFO org.apache.ratis.server.leader.FollowerInfo: 630b3fbd-861f-4676-8e16-8ae548ed6658@group-7587E602AEC2->55560a65-e7f7-46f0-b463-f511187fd358: snapshotIndex: setUnconditionally 0 -> 0
      2021-04-25 19:23:58,741 [grpc-default-executor-0] INFO org.apache.ratis.server.leader.FollowerInfo: 630b3fbd-861f-4676-8e16-8ae548ed6658@group-7587E602AEC2->55560a65-e7f7-46f0-b463-f511187fd358: matchIndex: setUnconditionally 0 -> 0
      2021-04-25 19:23:58,741 [grpc-default-executor-0] INFO org.apache.ratis.server.leader.FollowerInfo: 630b3fbd-861f-4676-8e16-8ae548ed6658@group-7587E602AEC2->55560a65-e7f7-46f0-b463-f511187fd358: nextIndex: setUnconditionally 0 -> 1
      2021-04-25 19:23:58,741 [grpc-default-executor-0] DEBUG org.apache.ratis.server.leader.FollowerInfo: 630b3fbd-861f-4676-8e16-8ae548ed6658@group-7587E602AEC2->55560a65-e7f7-46f0-b463-f511187fd358: commitIndex: updateToMax old=-1, new=0, updated? true
      

       

      Since matchIndex is monotonically increased, and the nextIndex will at least be matchIndex + 1, 

      final long nextIndex = 1 + Optional.ofNullable(request)
          .map(AppendEntriesRequest::getPreviousLog)
          .map(TermIndex::getIndex)
          .orElseGet(getFollower()::getMatchIndex);
      

      Leader keep sending entries from index 1, but follower is empty, missing entry 0, the cluster stuck in inconsistency append entries.

       

      With the fix, the download of checkpoint happened, and the system works well.

      2021-04-26 11:23:10,761 [grpc-default-executor-0] INFO org.apache.ratis.server.RaftServer$Division: a0ba8aa4-5cff-4452-9adf-638135a9ebaf@group-C00705B4E3D4: receive installSnapshot: 6e7bb411-69c5-40ea-8238-bc65cc4e9b55->a0ba8aa4-5cff-4452-9adf-638135a9ebaf#0-t2,notify:(t:2, i:1)
      2021-04-26 11:23:10,768 [grpc-default-executor-0] INFO org.apache.hadoop.hdds.scm.ha.SCMStateMachine: leader changed, yet current SCM is still follower.
      2021-04-26 11:23:10,768 [grpc-default-executor-0] INFO org.apache.ratis.server.RaftServer$Division: a0ba8aa4-5cff-4452-9adf-638135a9ebaf@group-C00705B4E3D4: change Leader from null to 6e7bb411-69c5-40ea-8238-bc65cc4e9b55 at term 2 for installSnapshot, leader elected after 1391ms
      2021-04-26 11:23:10,787 [grpc-default-executor-0] INFO org.apache.ratis.server.RaftServer$Division: a0ba8aa4-5cff-4452-9adf-638135a9ebaf@group-C00705B4E3D4: notifyInstallSnapshot: nextIndex is 0 but the leader's first available index is 1.
      2021-04-26 11:23:11,090 [grpc-default-executor-0] INFO org.apache.hadoop.hdds.scm.ha.SCMStateMachine: Received install snapshot notification from SCM leader: 9.134.50.210:9865 with term index: (t:2, i:1)
      2021-04-26 11:23:11,091 [pool-13-thread-1] INFO org.apache.hadoop.hdds.scm.ha.SCMHAManagerImpl: Downloading checkpoint from leader SCM scm1 and reloading state from the checkpoint.
      2021-04-26 11:23:11,092 [grpc-default-executor-0] DEBUG org.apache.ratis.server.RaftServer$Division: a0ba8aa4-5cff-4452-9adf-638135a9ebaf@group-C00705B4E3D4: Snapshot Installation Request received and is in progress
      2021-04-26 11:23:11,164 [grpc-default-executor-0] INFO org.apache.ratis.server.RaftServer$Division: a0ba8aa4-5cff-4452-9adf-638135a9ebaf@group-C00705B4E3D4: set new configuration index: 1
      configurationEntry {
        peers {
          id: "6e7bb411-69c5-40ea-8238-bc65cc4e9b55"
          address: "9.134.50.210:9865"
        }
      }
       from snapshot
      2021-04-26 11:23:11,166 [grpc-default-executor-0] DEBUG org.apache.ratis.util.LifeCycle: 6e7bb411-69c5-40ea-8238-bc65cc4e9b55|rpc:9.134.50.210:9865|admin:|client:|dataStream:|priority:0: NEW
      2021-04-26 11:23:11,167 [grpc-default-executor-0] INFO org.apache.ratis.server.RaftServer$Division: a0ba8aa4-5cff-4452-9adf-638135a9ebaf@group-C00705B4E3D4: set configuration 1: [6e7bb411-69c5-40ea-8238-bc65cc4e9b55|rpc:9.134.50.210:9865|admin:|client:|dataStream:|priority:0], old=null
      2021-04-26 11:23:11,171 [grpc-default-executor-0] INFO org.apache.ratis.server.RaftServer$Division: a0ba8aa4-5cff-4452-9adf-638135a9ebaf@group-C00705B4E3D4: reply installSnapshot: 6e7bb411-69c5-40ea-8238-bc65cc4e9b55<-a0ba8aa4-5cff-4452-9adf-638135a9ebaf#0:FAIL-t0,IN_PROGRESS
      2021-04-26 11:23:11,171 [grpc-default-executor-0] DEBUG org.apache.ratis.grpc.server.GrpcServerProtocolService: a0ba8aa4-5cff-4452-9adf-638135a9ebaf: reply 6e7bb411-69c5-40ea-8238-bc65cc4e9b55<-a0ba8aa4-5cff-4452-9adf-638135a9ebaf#0:FAIL-t0,IN_PROGRESS
      2021-04-26 11:23:11,175 [grpc-default-executor-0] INFO org.apache.ratis.grpc.server.GrpcServerProtocolService: a0ba8aa4-5cff-4452-9adf-638135a9ebaf: Completed INSTALL_SNAPSHOT, lastRequest: 6e7bb411-69c5-40ea-8238-bc65cc4e9b55->a0ba8aa4-5cff-4452-9adf-638135a9ebaf#0-t2,notify:(t:2, i:1)
      

       

      Attachments

        Activity

          This comment will be Viewable by All Users Viewable by All Users
          Cancel

          People

            glengeng Glen Geng
            glengeng Glen Geng
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Time Tracking

                Estimated:
                Original Estimate - Not Specified
                Not Specified
                Remaining:
                Remaining Estimate - 0h
                0h
                Logged:
                Time Spent - 1h 10m
                1h 10m

                Slack

                  Issue deployment