ZooKeeper / ZOOKEEPER-2172

Cluster crashes when reconfiguring a new node as a participant

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 3.5.0
    • Fix Version/s: 3.5.3, 3.6.0
    • Component/s: leaderElection, quorum, server
    • Labels:
      None
    • Environment:

      Ubuntu 12.04 + Java 7

    • Hadoop Flags:
      Reviewed

      Description

      The steps are simple: start three ZooKeeper servers one by one, and after each new server starts, reconfigure the cluster to add it as a participant. When I add the third server, the cluster may enter a bad state from which it cannot recover.

      I found “2015-04-20 12:53:48,236 [myid:1] - INFO [ProcessThread(sid:1 cport:-1)::PrepRequestProcessor@547] - Incremental reconfig” in the node-1 log, so the first node received the reconfig command at 12:53:48. Later, it logged “2015-04-20 12:53:52,230 [myid:1] - ERROR [LearnerHandler-/10.0.0.2:55890:LearnerHandler@580] - Unexpected exception causing shutdown while sock still open” and “2015-04-20 12:53:52,231 [myid:1] - WARN [LearnerHandler-/10.0.0.2:55890:LearnerHandler@595] - ******* GOODBYE /10.0.0.2:55890 ********”. From then on, the first and second nodes rejected all client connections, and the third node did not join the cluster as a participant. The whole cluster was down.
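      As a small worked check of the timeline, the sketch below parses the leading timestamps of the two node-1 log lines quoted above and computes the gap between the reconfig request and the LearnerHandler error. The helper names are illustrative only and are not part of ZooKeeper.

```python
from datetime import datetime

# The two node-1 log lines quoted in the description above.
RECONFIG_LINE = (
    "2015-04-20 12:53:48,236 [myid:1] - INFO "
    "[ProcessThread(sid:1 cport:-1)::PrepRequestProcessor@547] - Incremental reconfig"
)
ERROR_LINE = (
    "2015-04-20 12:53:52,230 [myid:1] - ERROR "
    "[LearnerHandler-/10.0.0.2:55890:LearnerHandler@580] - "
    "Unexpected exception causing shutdown while sock still open"
)


def log_timestamp(line):
    """Parse the leading 'YYYY-MM-DD HH:MM:SS,mmm' stamp (first 23 characters)."""
    return datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S,%f")


def seconds_between(first, second):
    """Elapsed seconds from the first log line to the second."""
    return (log_timestamp(second) - log_timestamp(first)).total_seconds()


gap = seconds_between(RECONFIG_LINE, ERROR_LINE)
print(f"{gap:.3f} s between reconfig and LearnerHandler error")  # 3.994 s
```

      So roughly four seconds passed between node-1 receiving the reconfig command and the learner connection being torn down.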

      When the problem happened, all three nodes were using the same dynamic config file, zoo.cfg.dynamic.10000005d, which contained only the first two nodes. However, the node-1 directory also held an unused dynamic config file, zoo.cfg.dynamic.next, which already contained all three nodes.
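      To make the mismatch between the two files concrete, here is a minimal sketch that compares the membership of an active dynamic config with a pending `.next` file. The contents of the attached files are not reproduced in this report, so the sample text, addresses, and ports below are hypothetical placeholders in the usual `server.<id>=<host>:<port>:<port>:<role>;<clientPort>` form; the function names are illustrative as well.

```python
def parse_members(text):
    """Map server id -> spec for every 'server.<id>=<spec>' line."""
    members = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line.startswith("server.") or "=" not in line:
            continue  # skip blank lines, comments, and any non-server keys
        key, spec = line.split("=", 1)
        members[int(key.strip()[len("server."):])] = spec.strip()
    return members


def pending_change(current_text, next_text):
    """Server ids that the pending config adds to or removes from the active one."""
    current = parse_members(current_text)
    pending = parse_members(next_text)
    return {
        "joining": sorted(set(pending) - set(current)),
        "leaving": sorted(set(current) - set(pending)),
    }


# Hypothetical stand-ins for zoo.cfg.dynamic.10000005d and zoo.cfg.dynamic.next.
ACTIVE = """\
server.1=10.0.0.1:2888:3888:participant;2181
server.2=10.0.0.2:2888:3888:participant;2181
"""
PENDING = ACTIVE + "server.3=10.0.0.3:2888:3888:participant;2181\n"

print(pending_change(ACTIVE, PENDING))  # {'joining': [3], 'leaving': []}
```

      A non-empty "joining" list alongside a leftover `.next` file matches the state described here: the reconfig that adds the third node was proposed but never took effect.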

      When I lengthened the wait between starting the third node and reconfiguring the cluster, the problem did not recur, so this looks like a race condition.
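      The longer wait described above can be made explicit in a test script by polling the new node before issuing the reconfig. The sketch below is only a way to reproduce that observation, not a fix for the race. It assumes the `srvr` four-letter command is reachable on the new node's client port; `srvr_mode` and `wait_until` are hypothetical helper names, and whether a reported mode is a sufficient readiness signal for this bug is an untested assumption.

```python
import socket
import time


def srvr_mode(host, port, timeout=2.0):
    """Return the 'Mode:' value from the 'srvr' four-letter command, or None."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"srvr")
            chunks = []
            while True:
                data = sock.recv(4096)
                if not data:  # server closes the connection after replying
                    break
                chunks.append(data)
    except OSError:  # refused, unreachable, or timed out
        return None
    for line in b"".join(chunks).decode("utf-8", "replace").splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None  # reply had no Mode line, e.g. the server is not serving yet


def wait_until(probe, timeout=60.0, interval=1.0,
               sleep=time.sleep, clock=time.monotonic):
    """Call probe() until it returns a truthy value or the timeout elapses."""
    deadline = clock() + timeout
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

      A script might call `wait_until(lambda: srvr_mode("10.0.0.3", 2181) is not None)` before sending the reconfig command, where the address and port stand in for the third node's client endpoint.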

        Attachments

        1. node-1.log
          171 kB
          Ziyou Wang
        2. node-2.log
          108 kB
          Ziyou Wang
        3. node-3.log
          20 kB
          Ziyou Wang
        4. zoo.cfg.dynamic.10000005d
          0.2 kB
          Ziyou Wang
        5. zoo.cfg.dynamic.next
          0.2 kB
          Ziyou Wang
        6. zookeeper-1.log
          1.31 MB
          Ziyou Wang
        7. zookeeper-2.log
          253 kB
          Ziyou Wang
        8. zookeeper-3.log
          62 kB
          Ziyou Wang
        9. zoo-1.log
          1.55 MB
          Ziyou Wang
        10. zoo-2.log
          495 kB
          Ziyou Wang
        11. zoo-3.log
          57 kB
          Ziyou Wang
        12. zoo-2-1.log
          1.60 MB
          Ziyou Wang
        13. zoo-2-2.log
          435 kB
          Ziyou Wang
        14. zoo-2-3.log
          60 kB
          Ziyou Wang
        15. zoo-3-1.log
          1.76 MB
          Ziyou Wang
        16. zoo-3-2.log
          860 kB
          Ziyou Wang
        17. zoo-3-3.log
          68 kB
          Ziyou Wang
        18. zoo-2212-1.log
          3.92 MB
          Ziyou Wang
        19. zoo-2212-2.log
          3.10 MB
          Ziyou Wang
        20. zoo-2212-3.log
          120 kB
          Ziyou Wang
        21. zoo-4-1.log
          1.39 MB
          Ziyou Wang
        22. zoo-4-2.log
          440 kB
          Ziyou Wang
        23. zoo-4-3.log
          60 kB
          Ziyou Wang
        24. history.txt
          63 kB
          Hitoshi Mitake
        25. zookeeper-1.out
          51 kB
          Hitoshi Mitake
        26. zookeeper-2.out
          33 kB
          Hitoshi Mitake
        27. zookeeper-3.out
          59 kB
          Hitoshi Mitake
        28. ZOOKEEPER-2172.patch
          0.7 kB
          Hitoshi Mitake
        29. ZOOKEEPER-2172-02.patch
          2 kB
          Mohammad Arshad
        30. ZOOKEEPER-2172-03.patch
          12 kB
          Mohammad Arshad
        31. ZOOKEEPER-2172-04.patch
          13 kB
          Mohammad Arshad
        32. ZOOKEPER-2172-05.patch
          16 kB
          Alexander Shraer
        33. ZOOKEEPER-2172-06.patch
          14 kB
          Alexander Shraer
        34. ZOOKEEPER-2172-07.patch
          13 kB
          Mohammad Arshad

              People

              • Assignee:
                Mohammad Arshad (arshad.mohammad)
              • Reporter:
                Ziyou Wang (ziyouw)
              • Votes:
                0
              • Watchers:
                15
