Uploaded image for project: 'Ratis'
  1. Ratis
  2. RATIS-1247 Support rolling upgrade and rollback
  3. RATIS-1796

Fix TransferLeadership stopped by heartbeat from old leader

    XMLWordPrintableJSON

Details

    • Sub-task
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • None
    • 3.0.0, 2.5.0
    • server
    • None

    Description

      Candidate state of transferee may be stopped by the appendEntries from old leader, see the log below

      2023-02-28 04:52:45,026 [s0-server-thread1] INFO  impl.TransferLeadership (TransferLeadership.java:tryTransferLeadership(107)) - s0@group-43918D205BB2: start transferring leadership to s1
      2023-02-28 04:52:45,029 [s0-server-thread1] INFO  impl.TransferLeadership (TransferLeadership.java:tryTransferLeadership(116)) - s0@group-43918D205BB2: sent StartLeaderElection to transferee s1 immediately as it already has up-to-date log
      2023-02-28 04:52:45,031 [grpc-default-executor-6] INFO  impl.RoleInfo (RoleInfo.java:shutdownFollowerState(111)) - s1: shutdown s1@group-43918D205BB2-FollowerState
      2023-02-28 04:52:45,032 [s1@group-43918D205BB2-FollowerState] INFO  impl.FollowerState (FollowerState.java:run(152)) - s1@group-43918D205BB2-FollowerState was interrupted
      2023-02-28 04:52:45,032 [grpc-default-executor-6] INFO  impl.RoleInfo (RoleInfo.java:updateAndGet(140)) - s1: start s1@group-43918D205BB2-LeaderElection4
      2023-02-28 04:52:45,054 [s1-server-thread1] INFO  impl.RoleInfo (RoleInfo.java:shutdownLeaderElection(131)) - s1: shutdown s1@group-43918D205BB2-LeaderElection4
      2023-02-28 04:52:45,054 [s1-server-thread1] INFO  impl.RoleInfo (RoleInfo.java:startFollowerState(104)) - s1: startFollowerState reason:appendEntries from s0 term 1,
        trace: java.base/java.lang.Thread.getStackTrace(Thread.java:1602),
          org.apache.ratis.server.impl.RoleInfo.startFollowerState(RoleInfo.java:104),
          org.apache.ratis.server.impl.RaftServerImpl.changeToFollower(RaftServerImpl.java:547),
          org.apache.ratis.server.impl.RaftServerImpl.changeToFollowerAndPersistMetadata(RaftServerImpl.java:556),
          org.apache.ratis.server.impl.RaftServerImpl.appendEntriesAsync(RaftServerImpl.java:1498),
          org.apache.ratis.server.impl.RaftServerImpl.appendEntriesAsync(RaftServerImpl.java:1396),
          org.apache.ratis.server.impl.RaftServerProxy.lambda$appendEntriesAsync$26(RaftServerProxy.java:639),
          org.apache.ratis.util.JavaUtils.callAsUnchecked(JavaUtils.java:117),
          org.apache.ratis.server.impl.RaftServerImpl.lambda$executeSubmitServerRequestAsync$11(RaftServerImpl.java:818),
          java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1700),
          java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128),
          java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628),
          java.base/java.lang.Thread.run(Thread.java:829)
      2023-02-28 04:52:45,055 [s1-server-thread1] INFO  impl.RoleInfo (RoleInfo.java:updateAndGet(140)) - s1: start s1@group-43918D205BB2-FollowerState
      

      Update: print isHeartbeat in log

      2023-03-07 02:28:54,806 [s1-server-thread4] INFO  impl.RoleInfo (RoleInfo.java:startFollowerState(104)) - s1: startFollowerState reason:appendEntries: s0, term: 1, isHeartbeat: true,
        trace:
          java.base/java.lang.Thread.getStackTrace(Thread.java:1602),
          org.apache.ratis.server.impl.RoleInfo.startFollowerState(RoleInfo.java:104),
          org.apache.ratis.server.impl.RaftServerImpl.changeToFollower(RaftServerImpl.java:547),
          org.apache.ratis.server.impl.RaftServerImpl.changeToFollowerAndPersistMetadata(RaftServerImpl.java:556),
          org.apache.ratis.server.impl.RaftServerImpl.appendEntriesAsync(RaftServerImpl.java:1498),
          org.apache.ratis.server.impl.RaftServerImpl.appendEntriesAsync(RaftServerImpl.java:1396),
          org.apache.ratis.server.impl.RaftServerProxy.lambda$appendEntriesAsync$26(RaftServerProxy.java:639),
          org.apache.ratis.util.JavaUtils.callAsUnchecked(JavaUtils.java:117),
          org.apache.ratis.server.impl.RaftServerImpl.lambda$executeSubmitServerRequestAsync$11(RaftServerImpl.java:818),
          java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1700),
          java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128),
          java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628),
          java.base/java.lang.Thread.run(Thread.java:829)
      

      Attachments

        Activity

          People

            ckj Kaijie Chen
            ckj Kaijie Chen
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Time Tracking

                Estimated:
                Original Estimate - Not Specified
                Not Specified
                Remaining:
                Remaining Estimate - 0h
                0h
                Logged:
                Time Spent - 1h 40m
                1h 40m