ZooKeeper / ZOOKEEPER-1692

Add support for single member ensemble

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 3.4.0
    • Fix Version/s: None
    • Component/s: quorum
    • Labels:
      None

      Description

      In the past, we have repeatedly run into situations where a quorum could not be formed. It takes a while to investigate the root cause and fix the problem.

      Our current solution is to make it possible to run a quorum with a single member in it. Unlike standalone mode, that member has to run as a LeaderZooKeeper server so that the observers can connect to it.

      This gives the operator a workaround for bringing the ensemble back quickly while investigating the problem in the background.

      The main problem here is to allow the observers to connect to the leader when the quorum size is reduced to one. We don't want to update the (static) configuration on the observers, since that requires a server restart. We are thinking of allowing an observer to connect to any participant that declares itself the leader, without running the leader election algorithm (because the election won't have enough votes).

        Activity

        Flavio Junqueira added a comment -

        Sounds like the motivation is different, but I'm wondering if this is related to ZOOKEEPER-1691, which was also just created.

        Thawan Kooburat added a comment -

        There are two parts to this JIRA. I think the first part is related to ZOOKEEPER-1691: when the quorum is down to one member, we should be running a LeaderZooKeeper server instead of the standalone ZooKeeper server. The second part is that we should make it possible for an observer to discover the leader without being restarted. This matters because our read-only observer (ZOOKEEPER-1607) is still running while the quorum is down.

        Michi Mutsuzaki added a comment -

        From the original description, it sounds like you already have something working for the first part. If that's the case, you can go ahead and close ZOOKEEPER-1691.

        Thanks!
        --Michi

        Michi Mutsuzaki added a comment -

        ... and it would be cool if you could post the patch for the first part soon so that I can try it out.

        Thawan Kooburat added a comment -

        I have the first part for static configuration. It is just a small patch, but I am not sure whether it will work for dynamic configuration.

        Thawan Kooburat added a comment -

        My current implementation adds a flag that modifies the behavior of FLE for the observers.

        With this flag, the observer looks at the leader votes (in the outofelection set). If there are no conflicting votes from the latest epoch (that is, all of the votes point to the same leader), the observer will try to connect to that leader. This mechanism only kicks in when the notification timeout reaches its maximum value.

        The idea behind this is to rely on FLE during normal operation and remove the majority voting requirement only when the quorum is down for an extended period.

        I haven't looked at how this feature may interact with dynamic configuration though.
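
The check described in this comment could be sketched roughly as follows. This is an illustration only, not the patch itself and not ZooKeeper's actual FastLeaderElection code: the class ObserverLeaderFallback, the simplified LeaderVote type, and the method pickLeader are all hypothetical names, and the timeout values are passed in as plain parameters rather than read from any real configuration.

```java
import java.util.Collection;
import java.util.OptionalLong;

// Hypothetical helper; none of these names exist in ZooKeeper.
final class ObserverLeaderFallback {

    // Simplified stand-in for a notification kept in the "outofelection" set.
    static final class LeaderVote {
        final long voterId;   // server that sent the notification
        final long leaderId;  // server it claims is the leader
        final long peerEpoch; // epoch attached to the vote

        LeaderVote(long voterId, long leaderId, long peerEpoch) {
            this.voterId = voterId;
            this.leaderId = leaderId;
            this.peerEpoch = peerEpoch;
        }
    }

    // Returns the leader the observer may try to connect to, or empty if the
    // fallback does not apply (timeout not yet at its maximum, no votes, or
    // conflicting votes in the latest epoch). No majority is required.
    static OptionalLong pickLeader(Collection<LeaderVote> outOfElection,
                                   int notificationTimeoutMs,
                                   int maxNotificationTimeoutMs) {
        // Only kick in once the notification timeout has reached its maximum.
        if (notificationTimeoutMs < maxNotificationTimeoutMs || outOfElection.isEmpty()) {
            return OptionalLong.empty();
        }

        // Find the latest epoch among the collected votes.
        long latestEpoch = Long.MIN_VALUE;
        for (LeaderVote v : outOfElection) {
            latestEpoch = Math.max(latestEpoch, v.peerEpoch);
        }

        // All votes from that epoch must point to the same leader.
        OptionalLong candidate = OptionalLong.empty();
        for (LeaderVote v : outOfElection) {
            if (v.peerEpoch != latestEpoch) {
                continue; // ignore votes from older epochs
            }
            if (candidate.isPresent() && candidate.getAsLong() != v.leaderId) {
                return OptionalLong.empty(); // conflicting votes: do nothing
            }
            candidate = OptionalLong.of(v.leaderId);
        }
        return candidate;
    }
}
```

Because the check returns empty until the timeout has reached its maximum, the normal FLE path is left untouched in this sketch, which matches the stated intent of dropping the majority requirement only after the quorum has been down for an extended period.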


          People

          • Assignee: Thawan Kooburat
          • Reporter: Thawan Kooburat
          • Votes: 0
          • Watchers: 4
