  1. ZooKeeper
  2. ZOOKEEPER-4039

An excessively large acceptedEpoch prevents the affected node from joining the cluster

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 3.5.5
    • Fix Version/s: None
    • Component/s: server
    • Labels:
      None

      Description

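      After the leader has received the acceptedEpoch of more than half of the nodes, it sets its own acceptedEpoch to the maximum of those values plus 1. If the leader crashes at that point, its acceptedEpoch is 1 greater than that of the other nodes. If this node then restarts, is elected leader again, and crashes again, and the remaining nodes then elect a new leader, the new leader's epoch is smaller than the original leader's acceptedEpoch. As a result, the original node keeps switching between the LOOKING and FOLLOWING states.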

       

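      Steps to reproduce:

      Three nodes: server1, server2, and server3

      • Start server1 and server2, then stop both at the point marked with the red dot below. At this point server2's acceptedEpoch is 1.
      • Start server1 and server2 again, then stop both at the point marked with the red dot below once more. At this point server2's acceptedEpoch is 2.
      • Start server1 and server3 and wait until they elect server3 as leader. Then start server2; it keeps repeating the exception below.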
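
      The scenario can be sketched as a toy model in Python. This is not ZooKeeper code: `Server`, `propose_epoch`, `try_follow`, and `reproduce` are hypothetical names, and only the acceptedEpoch bookkeeping from the description is modeled. The rule in `try_follow` (a follower refuses a leader whose epoch is lower than its own acceptedEpoch) is an assumption about what triggers the repeated exception.

```python
from dataclasses import dataclass


@dataclass
class Server:
    """Hypothetical stand-in for a peer; only the persisted acceptedEpoch is modeled."""
    name: str
    accepted_epoch: int = 0


def propose_epoch(leader, quorum):
    """Leader adopts max(acceptedEpoch over the quorum) + 1 as its own acceptedEpoch."""
    new_epoch = max(s.accepted_epoch for s in quorum) + 1
    # Recorded on the leader before any follower has learned the new epoch.
    leader.accepted_epoch = new_epoch
    return new_epoch


def try_follow(follower, leader_epoch):
    """Assumed rule: a follower rejects a leader whose epoch is below its acceptedEpoch."""
    if leader_epoch < follower.accepted_epoch:
        return False  # follower gives up and goes back to leader election
    follower.accepted_epoch = leader_epoch
    return True


def reproduce():
    s1, s2, s3 = Server("server1"), Server("server2"), Server("server3")

    # Steps 1 and 2: server2 leads and bumps its own acceptedEpoch, then both
    # servers are stopped before server1 records the new epoch.
    propose_epoch(s2, [s1, s2])
    propose_epoch(s2, [s1, s2])

    # Step 3: server1 and server3 form a quorum and server3 becomes leader.
    leader_epoch = propose_epoch(s3, [s1, s3])
    try_follow(s1, leader_epoch)

    # server2 comes back with a larger acceptedEpoch than the leader's epoch.
    joined = try_follow(s2, leader_epoch)
    return s2.accepted_epoch, leader_epoch, joined
```

      Under these assumptions, `reproduce()` returns `(2, 1, False)`: server2 holds acceptedEpoch 2, the new leader's epoch is 1, and server2 cannot join.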

       

       

        Attachments

        1. image-2020-12-28-17-54-09-661.png
          213 kB
          pengfei
        2. image-2020-12-28-17-58-11-673.png
          36 kB
          pengfei
        3. image-2020-12-28-18-01-46-005.png
          36 kB
          pengfei
        4. image-2020-12-28-18-02-21-563.png
          36 kB
          pengfei
        5. image-2020-12-28-18-03-58-557.png
          72 kB
          pengfei

              People

              • Assignee:
                ztzg Damien Diederen
                Reporter:
                pf pengfei
