Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: Public beta
    • Fix Version/s: 1.7.0
    • Component/s: consensus
    • Labels: None

    Description

      The re-replication support outlined in KUDU-1096 can be improved to give better availability properties. Here is a rough outline of such a design.

      Design:

      1. When a voter falls behind the leader's log GC threshold, the leader notifies the Master that the voter is no longer up to date.
      2. The Master selects a node to act as a replacement and adds it to the config as a PRE_VOTER (see KUDU-869). Once that node has caught up, it is automatically promoted to a VOTER.
      3. When the Master detects that the node has been promoted, it removes the bad node from the config.
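
      The three steps above can be sketched as a small simulation. This is only an illustration of the ordering, not Kudu's implementation: the function names, the dict-based config, and the node names "A" through "D" are all hypothetical.

```python
from enum import Enum


class Role(Enum):
    VOTER = "VOTER"
    PRE_VOTER = "PRE_VOTER"


def add_replacement(config, replacement):
    """Step 2 (first half): the Master adds the replacement as a PRE_VOTER."""
    updated = dict(config)
    updated[replacement] = Role.PRE_VOTER
    return updated


def promote_if_caught_up(config, node, caught_up):
    """Step 2 (second half): a caught-up PRE_VOTER becomes a VOTER."""
    updated = dict(config)
    if caught_up and updated.get(node) is Role.PRE_VOTER:
        updated[node] = Role.VOTER
    return updated


def evict_if_promoted(config, bad_node, replacement):
    """Step 3: the Master removes the bad node only after the promotion."""
    updated = dict(config)
    if updated.get(replacement) is Role.VOTER:
        updated.pop(bad_node, None)
    return updated


def voters(config):
    """Return the names of all full voters, sorted for stable output."""
    return sorted(name for name, role in config.items() if role is Role.VOTER)


# Step 1 is assumed to have happened already: the leader reported that "C"
# fell behind its log GC threshold.
config = {"A": Role.VOTER, "B": Role.VOTER, "C": Role.VOTER}
config = add_replacement(config, "D")

# "D" is still catching up, so it is not promoted and "C" stays in the config.
config = promote_if_caught_up(config, "D", caught_up=False)
config = evict_if_promoted(config, "C", "D")
voters_during_copy = voters(config)

# Once "D" has caught up, it is promoted and "C" can be removed.
config = promote_if_caught_up(config, "D", caught_up=True)
config = evict_if_promoted(config, "C", "D")
voters_after = voters(config)
```

      In this sketch the lagging voter stays in the config until its replacement is a full VOTER, which is the ordering the design describes.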

      Additional cases to detect and handle:

      • If the config is in a state where adding a node is impossible, because a voter that has fallen behind the log GC threshold is part of the required majority, then remotely bootstrap that voter without changing the config. The tablet will remain unable to serve writes during this time, but will self-heal without administrator intervention.
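
      One way to read this case is as a majority check: a config change can commit only if a majority of the current voters can still acknowledge it. The sketch below encodes that reading. The function names and return values are hypothetical, and the rule itself is an assumption about how the detection might work, not something this issue specifies.

```python
def majority_size(num_voters):
    """Smallest number of voters that forms a Raft majority."""
    return num_voters // 2 + 1


def choose_recovery(num_voters, num_up_to_date):
    """Pick a recovery path (hypothetical names, not Kudu's API).

    num_up_to_date counts the voters, leader included, that can still
    acknowledge new log entries.
    """
    if num_up_to_date >= majority_size(num_voters):
        # A config change can still commit, so add a PRE_VOTER replacement.
        return "add_replacement"
    # No majority without the lagging voter: bootstrap it in place instead.
    return "bootstrap_in_place"
```

      For example, with three voters where one is down and one has fallen behind the log GC threshold, only the leader is up to date, so `choose_recovery(3, 1)` selects the in-place bootstrap.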

      This can be further improved by adding support for aborting a config-change operation that cannot commit.

      The design requires some additional plumbing from the leader to the Master so that the leader can notify it of slow followers.

      Pros:

      • Closer to optimal fault-tolerance properties: a "majority lost" state is less likely to occur, so administrator intervention is less likely to be needed.

      Cons:

      • Requires support for pre-voters and a smarter Master.

            People

              Assignee: mpercy Mike Percy
              Reporter: mpercy Mike Percy