  1. Apache Ozone
  2. HDDS-2823 SCM HA Support
  3. HDDS-5090

Make Decommission work under SCM HA

    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.2.0
    • Component/s: None

      Description

      The problem

      Decommission and maintenance state is held only in SCM memory. When an SCM restarts, it relearns this state as each Datanode re-registers.

      Only the leader SCM handles decommissionNodes(), recommissionNodes(), and startMaintenanceNodes() requests, and it does not replicate the resulting state to the follower SCMs. When a failover happens, the new leader SCM has no record of this state, because it existed only in the memory of the previous leader.

      Current status
      If an SCM is restarted, a datanode that re-registers may already be in the DECOMMISSIONING, ENTERING_MAINTENANCE, or IN_MAINTENANCE state. In that case, it needs to be added back into the monitor so that its progress can be tracked.

      For a registered node, the information stored in SCM is the source of truth. If SCM finds that the opState or opStateExpiryEpoch reported by the Datanode differs from what SCM holds in memory, it sends a SetNodeOperationalStateCommand to update the Datanode.
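The comparison described above can be sketched roughly as follows. This is a minimal illustration, not the Ozone implementation: OpState, NodeOpStatus, OpStateReconciler, and commandFor are hypothetical names, and the returned value merely stands in for the payload of a SetNodeOperationalStateCommand.

```java
import java.util.Optional;

// Hypothetical, simplified stand-ins for illustration; not the real Ozone classes.
enum OpState { IN_SERVICE, DECOMMISSIONING, ENTERING_MAINTENANCE, IN_MAINTENANCE }

final class NodeOpStatus {
  final OpState opState;
  final long opStateExpiryEpoch;

  NodeOpStatus(OpState opState, long opStateExpiryEpoch) {
    this.opState = opState;
    this.opStateExpiryEpoch = opStateExpiryEpoch;
  }

  // Two statuses match only if both the state and its expiry agree.
  boolean matches(NodeOpStatus other) {
    return opState == other.opState
        && opStateExpiryEpoch == other.opStateExpiryEpoch;
  }
}

final class OpStateReconciler {
  /**
   * SCM's in-memory view wins. Returns the status SCM should push to the
   * datanode, or empty when the datanode's report already matches.
   */
  static Optional<NodeOpStatus> commandFor(NodeOpStatus scmView, NodeOpStatus reported) {
    if (scmView.matches(reported)) {
      return Optional.empty();
    }
    return Optional.of(scmView);
  }
}
```

Under these assumptions, a datanode reporting IN_SERVICE while SCM holds DECOMMISSIONING would receive SCM's status back, and a matching report would produce no command.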

      The solution

      leader SCM --hb--> DN --hb--> follower SCM    (hb = heartbeat)

      1. The leader SCM updates the PersistedOpState of the Datanode via heartbeat. The Datanode then updates its OpState in the follower SCMs via heartbeat.

      2. When a follower SCM becomes leader, it calls continueAdminForNode for every datanode, so that datanodes in the DECOMMISSIONING, ENTERING_MAINTENANCE, or IN_MAINTENANCE state are added back to the monitor.
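The two steps above can be sketched as a small model. It is an illustration under stated assumptions rather than the actual patch: FollowerScmSketch, onHeartbeat, onBecomeLeader, and monitoredNodes are hypothetical names, and continueAdminForNode here only mirrors the behavior described in this issue, not the real method's signature.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified state enum for illustration.
enum NodeOpState { IN_SERVICE, DECOMMISSIONING, ENTERING_MAINTENANCE, IN_MAINTENANCE }

final class FollowerScmSketch {
  // Step 1: opState per datanode id, as learned from datanode heartbeats.
  private final Map<String, NodeOpState> reportedOpState = new LinkedHashMap<>();
  // Nodes whose decommission or maintenance progress is being tracked.
  private final List<String> monitored = new ArrayList<>();

  // A datanode heartbeat carries the opState the datanode has persisted.
  void onHeartbeat(String datanodeId, NodeOpState persistedOpState) {
    reportedOpState.put(datanodeId, persistedOpState);
  }

  // Re-adds a node to the monitor if it is in an admin state.
  void continueAdminForNode(String datanodeId) {
    NodeOpState state = reportedOpState.get(datanodeId);
    boolean inAdminState = state == NodeOpState.DECOMMISSIONING
        || state == NodeOpState.ENTERING_MAINTENANCE
        || state == NodeOpState.IN_MAINTENANCE;
    if (inAdminState && !monitored.contains(datanodeId)) {
      monitored.add(datanodeId);
    }
  }

  // Step 2: on leader election, walk every known datanode.
  void onBecomeLeader() {
    for (String datanodeId : reportedOpState.keySet()) {
      continueAdminForNode(datanodeId);
    }
  }

  List<String> monitoredNodes() {
    return new ArrayList<>(monitored);
  }
}
```

In this model, a follower that has heard heartbeats from dn1 (IN_SERVICE), dn2 (DECOMMISSIONING), and dn3 (IN_MAINTENANCE) would, on becoming leader, monitor only dn2 and dn3.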

      Disadvantage

      As with the current behavior, if the leader SCM records the state and notifies the Datanode via heartbeat, but steps down before the Datanode has notified the follower SCM via heartbeat, that state is lost in the new leader SCM.

      As discussed with Stephen O'Donnell, we can live with the rare event of a decommission starting and SCM failing over before the state has made it to the DNs.
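The window in which the state can be lost can be illustrated with a toy model. Everything here is hypothetical (LostUpdateWindow and its methods are invented for the example, and states are plain strings); it only shows the ordering of events, not how Ozone stores or transmits state.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the failover race; all names are hypothetical.
final class LostUpdateWindow {
  private final Map<String, String> leaderView = new HashMap<>();
  private final Map<String, String> datanodePersisted = new HashMap<>();
  private final Map<String, String> followerView = new HashMap<>();

  // Admin request handled by the leader only; nothing is replicated.
  void adminDecommission(String dn) {
    leaderView.put(dn, "DECOMMISSIONING");
  }

  // Leader -> datanode: the datanode persists the leader's opState.
  void leaderHeartbeatReply(String dn) {
    datanodePersisted.put(dn, leaderView.getOrDefault(dn, "IN_SERVICE"));
  }

  // Datanode -> follower: the follower learns the persisted opState.
  void datanodeHeartbeatToFollower(String dn) {
    followerView.put(dn, datanodePersisted.getOrDefault(dn, "IN_SERVICE"));
  }

  // What the follower believes at the moment it takes over as leader.
  String stateAfterFailover(String dn) {
    return followerView.getOrDefault(dn, "IN_SERVICE");
  }
}
```

If failover happens after leaderHeartbeatReply but before datanodeHeartbeatToFollower, stateAfterFailover still yields IN_SERVICE in this model; once the datanode heartbeat has reached the follower, it yields DECOMMISSIONING.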

       

      For details: https://docs.google.com/document/d/1N5PsUuLBGgvkYFQgDumvRZujc-9RcDwoE0SubZcLUzY/edit?usp=sharing

       


              People

              • Assignee: Glen Geng (glengeng)
              • Reporter: Glen Geng (glengeng)
