
Details

    • Sub-task
    • Status: Resolved
    • Major
    • Resolution: Won't Do
    • None
    • n/a
    • consensus
    • None

    Description

      In certain scenarios it is desirable for replicas that do not exist on a tablet server to be able to vote. Since the implementation of KUDU-871, tombstoned tablets are able to vote. However, there are circumstances (at least in a pre-KUDU-1097 world) where voters that have no copy of a replica, running or tombstoned, would need to vote to ensure availability in certain edge-case failure scenarios.

      The quick justification for why it would be safe for a non-existent replica to vote is that it would be equivalent to a replica that has simply not yet replicated any WAL entries, in which case it would be legal to vote for any candidate. Of course, a candidate would only ask such a replica to vote for it if it believed that replica to be a voter in its config.
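
The equivalence argument above can be illustrated with a minimal sketch of a generic Raft vote handler. This is not Kudu's implementation: `ReplicaState` and `handle_vote_request` are hypothetical names, and the sketch ignores the requirement that a real voter durably persist its vote before replying. It only shows that a replica with an empty log (the state a non-existent replica would be treated as having) passes the log-freshness check for any candidate, while still being bound by the one-vote-per-term rule.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ReplicaState:
    """Hypothetical minimal Raft voter state (not Kudu's actual classes)."""
    current_term: int = 0
    voted_for: Optional[str] = None
    # (term, index) of the last WAL entry; (0, 0) means the log is empty.
    last_log: Tuple[int, int] = (0, 0)


def handle_vote_request(state: ReplicaState, candidate: str, term: int,
                        candidate_last_log: Tuple[int, int]) -> bool:
    """Apply the standard Raft voting rules and return whether the vote is granted."""
    if term < state.current_term:
        return False  # stale candidate
    if term > state.current_term:
        # Newer term: adopt it and forget any vote cast in an older term.
        state.current_term = term
        state.voted_for = None
    if state.voted_for not in (None, candidate):
        return False  # already voted for someone else in this term
    if candidate_last_log < state.last_log:
        return False  # candidate's log is behind ours
    state.voted_for = candidate
    return True
```

A replica created with `ReplicaState()` has nothing in its log, so the freshness check can never reject a candidate; only the term and prior-vote checks apply.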

      Some additional discussion can be found here: https://github.com/apache/kudu/blob/master/docs/design-docs/raft-tablet-copy.md#should-a-server-be-allowed-to-vote-if-it-does_not_exist-or-is-deleted

      What follows is an example of a scenario where "non-existent" replicas being able to vote would be desired:

      In a 3-2-3 re-replication paradigm, the leader (A) of a 3-replica config {A, B, C} evicts one replica (C). The leader then adds a new voter (D). A writes this config change to its local WAL, but before it is able to replicate the change to B or D, A is partitioned from the rest of the cluster at the network level. After this, the entire cluster is brought down, the network is restored, and the entire cluster is restarted. However, B fails to come back online due to a hardware failure.

      The only way to automatically recover in this scenario is to allow D, which has no knowledge of the tablet in question, to vote for A to become leader. A would then tablet copy to D, making the tablet available for writes.
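
The vote arithmetic behind this scenario can be sketched as follows. The helper `can_win_election` is a hypothetical illustration, not a Kudu function; it assumes a simple majority over the voters in the candidate's own (locally written) config, which for A is {A, B, D}.

```python
def can_win_election(config, granting):
    """Return True if the voters in `granting` form a majority of `config`."""
    majority = len(config) // 2 + 1
    # Only votes from members of the candidate's config count.
    return len(set(config) & set(granting)) >= majority


# A's config after it logged the addition of D. B is down after the restart.
config_at_a = {"A", "B", "D"}

# If D refuses to vote because it has no replica, A only has its own vote.
stuck = can_win_election(config_at_a, {"A"})

# If D is allowed to vote despite having no replica, A reaches a majority.
recovered = can_win_election(config_at_a, {"A", "D"})
```

With B unavailable, A needs 2 of 3 votes, so D's vote is the only one that can complete the majority.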


          People

            Unassigned Unassigned
            mpercy Mike Percy