  1. Apache Ozone
  2. HDDS-4330 Bootstrap new OM node
  3. HDDS-4775

Avoid OM split brain going from 1 node OM to 3 node ratis without bootstrap

Details

    • Type: Sub-task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved

    Description

      The expected way to add OMs to a single-node OM Ratis cluster is to update the configuration to include the new OMs, and then start the new OMs in bootstrap mode (by supplying the --bootstrap flag on startup) so that they receive the full transaction history from the original OM. However, a user can mistakenly update the configs and then start the new OMs without the --bootstrap flag. In that case the two new OMs form their own Ratis ring with a 2/3 majority that can service write requests, while the original OM remains the leader of a single-node Ratis ring and also services write requests. The result is a split-brain scenario that is difficult to detect without inspecting the logs, because all OMs appear functional and the cluster is writeable.
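
      To make the quorum arithmetic behind this concrete, here is a minimal sketch. It is not Ozone or Ratis code: the class QuorumSketch and its methods are hypothetical names introduced only for illustration, and the only assumption is the standard Raft rule that a group can commit writes when a strict majority of its members participate.

```java
public class QuorumSketch {
    // Smallest number of members that forms a strict majority of a Raft group.
    // (Hypothetical helper, not part of Ozone or Ratis.)
    public static int majority(int groupSize) {
        return groupSize / 2 + 1;
    }

    // True if the participating members are enough to elect a leader and commit writes.
    public static boolean canCommit(int groupSize, int liveMembers) {
        return liveMembers >= majority(groupSize);
    }

    public static void main(String[] args) {
        // Original OM: its Ratis ring still contains only itself.
        boolean originalRingWritable = canCommit(1, 1);

        // New OMs started without bootstrap: configured for three nodes,
        // but only the two new OMs take part in this ring.
        boolean newRingWritable = canCommit(3, 2);

        System.out.println("original ring writable: " + originalRingWritable);
        System.out.println("new ring writable: " + newRingWritable);

        // Both rings accepting writes independently is the split-brain condition.
        System.out.println("split brain possible: "
            + (originalRingWritable && newRingWritable));
    }
}
```

      Because 1 of 1 and 2 of 3 are both majorities of their respective groups, neither ring has any reason to reject writes, which is why the problem is not visible from the outside.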

      This Jira aims to detect such a scenario and shut down the OMs when it occurs, instructing the user to bootstrap the new OMs on startup instead.

          People

            Assignee: Unassigned
            Reporter: Ethan Rose (erose)