A typical SCM sequence for driving datanodes through an upgrade would look
something like the following:
- client sends Finalize
- SCM moves to the Finalizing state. If SCM crashes and comes back up,
it will always restart from this state.
- SCM disallows new pipeline creation; SCM is in safe mode
(SCM freeze for new pipeline creation)
- SCM closes existing pipelines
- SCM sets MLV = SLV if it is not already so, and updates the on-disk MLV state.
- SCM moves all data nodes to the HEALTHY_READONLY state. Please note that
the initial state for every data node is HEALTHY_READONLY. For a data node
to move from HEALTHY_READONLY -> HEALTHY, it needs to send at least
one heartbeat where DN.MLV == SCM.MLV
- SCM waits for a few heartbeats
- SCM allows new pipeline creation (SCM thaw for new pipeline creation).
New pipelines can be created if enough HEALTHY data nodes are found.
- If SCM comes across any data node heartbeat with DN.MLV < SCM.MLV, SCM sends
that data node a finalize command
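
The ordering of the steps above can be sketched roughly as follows. This is only an illustration of the sequence, not the actual SCM code: the class FinalizationSketch, its fields, and the step names are all hypothetical, and closing pipelines and waiting for heartbeats are recorded as step names rather than performed. The sketch assumes every step is safe to repeat, which is what lets SCM restart from the Finalizing state after a crash.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the step ordering only; not the actual SCM implementation.
final class FinalizationSketch {
    boolean finalizing = false;            // a real SCM would persist this flag
    boolean pipelineCreationAllowed = true;
    int mlv;                               // SCM.MLV
    final int slv;                         // SCM.SLV
    final List<String> steps = new ArrayList<>();

    FinalizationSketch(int mlv, int slv) {
        this.mlv = mlv;
        this.slv = slv;
    }

    // Runs the sequence from the top. The same method serves as the restart
    // entry point, on the assumption that each step can be repeated safely.
    void runFinalization() {
        finalizing = true;

        pipelineCreationAllowed = false;   // freeze new pipeline creation
        steps.add("freeze-pipeline-creation");

        steps.add("close-existing-pipelines");

        if (mlv != slv) {                  // update MLV only if not already equal
            mlv = slv;
            steps.add("persist-mlv");
        }

        steps.add("mark-datanodes-healthy-readonly");
        steps.add("wait-for-heartbeats");

        pipelineCreationAllowed = true;    // thaw new pipeline creation
        steps.add("thaw-pipeline-creation");
    }
}
```

Running runFinalization() a second time repeats the freeze, close, and thaw steps but skips the MLV update, since MLV already equals SLV.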
As part of this, we will introduce a new state, HEALTHY_READONLY, in the datanode state machine maintained in SCM.
This Jira will be used to make changes in the datanode state machine.
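
A minimal sketch of how the HEALTHY_READONLY transition could be evaluated on each heartbeat is given below. The names NodeState, ScmCommand, HeartbeatResult, and HeartbeatHandler are hypothetical and do not refer to existing classes. The DN.MLV > SCM.MLV case is not covered by the description above, so the sketch simply leaves the node state unchanged there; that is an assumption.

```java
// Hypothetical sketch; names are illustrative, not existing SCM classes.
enum NodeState { HEALTHY_READONLY, HEALTHY }

enum ScmCommand { NONE, FINALIZE }

final class HeartbeatResult {
    final NodeState newState;
    final ScmCommand command;

    HeartbeatResult(NodeState newState, ScmCommand command) {
        this.newState = newState;
        this.command = command;
    }
}

final class HeartbeatHandler {
    // Decides the datanode's next state and any command to send back,
    // based on the MLV reported in the heartbeat.
    static HeartbeatResult onHeartbeat(NodeState current, int dnMlv, int scmMlv) {
        if (dnMlv == scmMlv) {
            // One matching heartbeat is enough to become HEALTHY.
            return new HeartbeatResult(NodeState.HEALTHY, ScmCommand.NONE);
        }
        if (dnMlv < scmMlv) {
            // The datanode is behind: keep it read-only and ask it to finalize.
            return new HeartbeatResult(NodeState.HEALTHY_READONLY, ScmCommand.FINALIZE);
        }
        // Assumption: DN.MLV > SCM.MLV is not described; leave the state as is.
        return new HeartbeatResult(current, ScmCommand.NONE);
    }
}
```

For example, a datanode reporting MLV 1 to an SCM at MLV 2 stays in HEALTHY_READONLY and receives a finalize command; once it reports MLV 2, it moves to HEALTHY and becomes eligible for new pipelines.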