Ratis / RATIS-1482

shouldNotifyToInstallSnapshot returns wrong term index

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: server
    • Labels: None

    Description

      To simulate a snapshot transmission delay, I added a Thread.sleep call to the mocked notifyInstallSnapshotFromLeader. With that change, the entire InstallSnapshotNotificationTests suite failed.
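
A minimal sketch of this kind of delay injection, assuming the mock returns a CompletableFuture that completes once the snapshot is installed. The class name, the method name notifyInstallSnapshot, and the use of a plain long index are hypothetical stand-ins for illustration, not the Ratis API or the actual test code:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a mocked snapshot-install notification handler.
class DelayedSnapshotNotification {

    // Completes with the installed snapshot index after sleeping for delayMs,
    // which imitates a slow snapshot transfer (assumption: index is a plain long).
    static CompletableFuture<Long> notifyInstallSnapshot(long delayMs, long snapshotIndex) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                Thread.sleep(delayMs); // the injected delay
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while simulating delay", e);
            }
            return snapshotIndex;
        });
    }

    public static void main(String[] args) throws Exception {
        CompletableFuture<Long> future = notifyInstallSnapshot(200, 257);
        // While the future is pending, a caller that retries frequently
        // may send further notifications for the same snapshot.
        System.out.println("installed index: " + future.get(5, TimeUnit.SECONDS));
    }
}
```

The longer the simulated delay, the wider the window in which repeated notifications can arrive before the first installation completes.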

      I captured the log and attached it as an HTML export.

      The main clue is the following stack trace:

       

      2022-01-06 15:45:42,070 [s1@group-F75934292722-StateMachineUpdater] ERROR impl.StateMachineUpdater (StateMachineUpdater.java:run(195)) - s1@group-F75934292722-StateMachineUpdater caught a Throwable.
      java.lang.IllegalStateException: org.apache.ratis.util.Preconditions$$Lambda$99/1414973146@69e38d26
      at org.apache.ratis.util.Preconditions.assertTrue(Preconditions.java:45)
      at org.apache.ratis.util.Preconditions.assertNull(Preconditions.java:82)
      at org.apache.ratis.util.Preconditions.assertNull(Preconditions.java:86)
      at org.apache.ratis.statemachine.SimpleStateMachine4Testing.put(SimpleStateMachine4Testing.java:200)
      at org.apache.ratis.statemachine.SimpleStateMachine4Testing.loadSnapshot(SimpleStateMachine4Testing.java:315)
      at org.apache.ratis.statemachine.SimpleStateMachine4Testing.reinitialize(SimpleStateMachine4Testing.java:235)
      at org.apache.ratis.server.impl.StateMachineUpdater.reload(StateMachineUpdater.java:218)
      at org.apache.ratis.server.impl.StateMachineUpdater.run(StateMachineUpdater.java:180)
      at java.lang.Thread.run(Thread.java:812) 
      
      

      The repeated ``reinitialize`` call is suspicious: the snapshot had already been installed before this error occurred. Normally, the follower should then receive the subsequent append-entries requests and catch up.

      The error has two contributing causes:

      1. The notify-install-snapshot request is sent repeatedly and very frequently.
      2. 2022-01-06 15:45:41,063 [grpc-default-executor-0] INFO server.RaftServer$Division (RaftServerImpl.java:notifyStateMachineToInstallSnapshot(1620)) - s1@group-F75934292722: notifyInstallSnapshot: nextIndex is 255 but the leader's first available index is 258. 

         This line shows that after the snapshot was installed, the follower received the notification again, and the request passed the check that should have detected that the snapshot was already installed.
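
The expected behavior can be illustrated with a small follower-side guard that makes repeated notifications harmless. This is only a sketch under stated assumptions: the class, enum, and method names are hypothetical, indices are plain longs, and it is not the Ratis implementation or the fix applied for this issue.

```java
// Hypothetical guard that decides how to treat an install-snapshot notification.
class SnapshotNotificationGuard {

    enum Result { ALREADY_INSTALLED, IN_PROGRESS, START_INSTALL }

    private long installedSnapshotIndex = -1L; // -1 means no snapshot installed yet
    private boolean installInProgress = false;

    // Called for every notification; leaderFirstAvailableIndex is the first
    // log index the leader can still serve (258 in the log line above).
    synchronized Result onNotify(long leaderFirstAvailableIndex) {
        // A snapshot ending right before the leader's first available index
        // (or later) means the follower can catch up from the log alone.
        if (installedSnapshotIndex + 1 >= leaderFirstAvailableIndex) {
            return Result.ALREADY_INSTALLED;
        }
        // Only one installation may run at a time; duplicates are ignored.
        if (installInProgress) {
            return Result.IN_PROGRESS;
        }
        installInProgress = true;
        return Result.START_INSTALL;
    }

    // Called once the state machine has finished installing a snapshot.
    synchronized void onInstalled(long snapshotIndex) {
        installedSnapshotIndex = Math.max(installedSnapshotIndex, snapshotIndex);
        installInProgress = false;
    }

    public static void main(String[] args) {
        SnapshotNotificationGuard guard = new SnapshotNotificationGuard();
        System.out.println(guard.onNotify(258)); // first notification
        System.out.println(guard.onNotify(258)); // duplicate while installing
        guard.onInstalled(257);
        System.out.println(guard.onNotify(258)); // duplicate after installation
    }
}
```

With a guard of this shape, a late duplicate notification no longer triggers a second ``reinitialize``, because the comparison uses the installed snapshot index rather than a stale nextIndex.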

       

            People

              Assignee: Nibiruxu Xu Shao Hong
              Reporter: Nibiruxu Xu Shao Hong
                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 1h