ZooKeeper / ZOOKEEPER-4039

An excessively large acceptedEpoch prevents the affected node from joining the cluster


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 3.5.5
    • Fix Version/s: None
    • Component/s: server
    • Labels: None

    Description

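      After receiving the acceptedEpoch values from more than half of the nodes, the leader sets its own acceptedEpoch to the maximum of those values plus 1. If the leader crashes at this point, its acceptedEpoch ends up 1 greater than that of the other nodes. If that node then restarts, is elected leader again, and crashes again, the remaining nodes elect a new leader whose epoch is smaller than the original leader's acceptedEpoch. As a result, the original node keeps switching between the LOOKING and FOLLOWING states and never joins the cluster.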
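
The two rules involved can be sketched as a toy model. This is a simplified illustration, not ZooKeeper source code: the function names `propose_epoch` and `follower_accepts` are hypothetical, and the sketch assumes that a follower refuses any leader whose epoch is lower than its own acceptedEpoch, which is the behavior this report describes.

```python
# Simplified model of the epoch handshake described above.
# Hypothetical helper names; this is not ZooKeeper's actual code.

def propose_epoch(accepted_epochs):
    """Leader side: new epoch = max acceptedEpoch reported by the quorum, plus 1."""
    return max(accepted_epochs) + 1


def follower_accepts(leader_epoch, own_accepted_epoch):
    """Follower side (assumed rule): refuse a leader whose epoch is
    below the acceptedEpoch this node has already persisted."""
    return leader_epoch >= own_accepted_epoch


# A new leader is elected by nodes that never saw the inflated value.
new_epoch = propose_epoch([0, 0])

# The old leader persisted a higher acceptedEpoch before crashing twice.
stale_accepted_epoch = 2

# The old leader cannot follow the new leader, so it returns to LOOKING.
can_join = follower_accepts(new_epoch, stale_accepted_epoch)
```

Because the new leader only sees the acceptedEpoch values of its own quorum, nothing in this model ever raises its epoch to the stale node's value, so the rejection repeats on every attempt.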

       

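      Steps to reproduce:

      Three nodes: server1, server2, server3

      • Start server1 and server2, then stop both at the red-dot position shown below. At this point server2's acceptedEpoch = 1.
      • Start server1 and server2 again, then stop both at the same red-dot position. At this point server2's acceptedEpoch = 2.
      • Start server1 and server3 and wait until they elect server3 as leader. Then start server2; it keeps repeating the exception shown below.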
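
The steps above can be traced with a small simulation. It is a hedged sketch under stated assumptions, not ZooKeeper code: every node is assumed to start with acceptedEpoch = 0, server2 is assumed to be the leader in the first two steps, and the red-dot breakpoint is assumed to fall after the leader has persisted its new acceptedEpoch but before the follower has. The helper `leader_handshake` is hypothetical.

```python
# Toy trace of the reproduction steps; not ZooKeeper's actual code.
# Assumed initial state: no node has accepted any epoch yet.
accepted = {"server1": 0, "server2": 0, "server3": 0}


def leader_handshake(leader, quorum, stopped_at_breakpoint):
    """Leader persists max(acceptedEpoch of quorum) + 1. Followers persist
    the same value only if the ensemble is not stopped at the breakpoint."""
    epoch = max(accepted[node] for node in quorum) + 1
    accepted[leader] = epoch
    if not stopped_at_breakpoint:
        for node in quorum:
            accepted[node] = epoch
    return epoch


# Step 1: server1 + server2, stopped at the breakpoint.
leader_handshake("server2", ["server1", "server2"], True)
after_step1 = accepted["server2"]

# Step 2: same again; server2's acceptedEpoch grows, server1's does not.
leader_handshake("server2", ["server1", "server2"], True)
after_step2 = accepted["server2"]

# Step 3: server1 + server3 elect server3 and finish the handshake.
server3_epoch = leader_handshake("server3", ["server1", "server3"], False)

# server2 starts and compares the leader's epoch with its own acceptedEpoch.
server2_rejects_leader = server3_epoch < accepted["server2"]
```

In this model server2 ends with acceptedEpoch = 2 while server3 leads with epoch 1, which matches the values given in the steps and leaves server2 unable to follow the new leader.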

       

       

      Attachments

        1. image-2020-12-28-18-03-58-557.png
          72 kB
          pengfei
        2. image-2020-12-28-18-02-21-563.png
          36 kB
          pengfei
        3. image-2020-12-28-18-01-46-005.png
          36 kB
          pengfei
        4. image-2020-12-28-17-58-11-673.png
          36 kB
          pengfei
        5. image-2020-12-28-17-54-09-661.png
          213 kB
          pengfei


            People

              Assignee: Damien Diederen (ztzg)
              Reporter: pengfei (pf)
              Votes: 0
              Watchers: 2
