Uploaded image for project: 'Apache Ozone'
  1. Apache Ozone
  2. HDDS-5062

Add a config to bypass clusterId validation for bootstrapping SCM

    XMLWordPrintableJSON

Details

    Description

      IN SCM HA, the primary node starts up the ratis server while other bootstrapping nodes will get added to the ratis group. Now, if all the bootstrapping SCM's get stopped, the primary node will now step down from leadership as it will loose majority. If the bootstrapping nodes are now bootstrapped again,  the bootsrapping node will try to first validate the cluster id from the leader SCM with the persisted cluster id , but as there is no leader existing, bootstrapping wil keep on failing and retrying until it shuts down. 

      The issue can be very easily simulated in kubernetes deployments, where bootstrap and init cmds are run repeatedly on every restart.

      The Jira aims to bypass the cluster id validation if a bootstrapping node already has a cluster id.

      Attachments

        Issue Links

          Activity

            People

              shashikant Shashikant Banerjee
              shashikant Shashikant Banerjee
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: