Problem Statement

Single State Model vs. Multiple State Models

Currently, each Helix resource is associated with a single state model, and at any time each replica of a partition is in exactly one of the states defined in that model. Helix manages state transitions based on this single state model.

However, in many scenarios a resource is too complicated to be modeled by a single state model.
For example, the partitions of a resource could be described along different dimensions: the slave/master state, the read/write (R/W) state, and the version. Each dimension captures a different aspect of the overall resource status, and the states in each dimension are defined by a different state model. Note that the state machines are simplified in this document.

The basic idea is that the states in these three dimensions are parallel and can change independently. For instance, the R/W state may change without updating the slave/master state.
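To make the idea of parallel dimensions concrete, here is a minimal Java sketch. It is only an illustration under assumptions: the class `MultiDimensionReplica`, the `Dimension` enum, and the state names are hypothetical and are not part of the Helix API.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical illustration only; none of these types exist in Helix.
class MultiDimensionReplica {
    // The three dimensions named in the problem statement.
    enum Dimension { ROLE, ACCESS, VERSION }

    // One current state per dimension, kept as plain strings for simplicity.
    private final Map<Dimension, String> states = new EnumMap<>(Dimension.class);

    MultiDimensionReplica(String role, String access, String version) {
        states.put(Dimension.ROLE, role);
        states.put(Dimension.ACCESS, access);
        states.put(Dimension.VERSION, version);
    }

    // A transition touches exactly one dimension and leaves the others unchanged.
    void transition(Dimension dimension, String newState) {
        states.put(dimension, newState);
    }

    String get(Dimension dimension) {
        return states.get(dimension);
    }
}
```

With this shape, switching a replica from read/write to read-only is a transition in the ACCESS dimension only; its ROLE and VERSION states stay as they were.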

Finite State Machine vs. Dynamic State Model

In addition, Helix uses a finite state machine to define a state model. However, some state models cannot easily be modeled by a finite state machine with a fixed set of states; versions are one example. We call such a state model a dynamic state model. Its states are read, set, and understood by the application. We will need to extend Helix to support dynamic state models. Note that Helix should not, and will not be able to, calculate the best possible dynamic states.
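The contrast between the two kinds of model can be sketched as follows. This is a hedged illustration, not Helix code: `StateModelSketch`, `FINITE_TRANSITIONS`, and `DynamicState` are hypothetical names, and the transition table is a simplified slave/master model assumed for the example.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; these names are not part of Helix.
class StateModelSketch {
    // Finite model: every state and every legal transition is known up front,
    // so a framework can validate and plan transitions by itself.
    static final Map<String, Set<String>> FINITE_TRANSITIONS = Map.of(
        "OFFLINE", Set.of("SLAVE"),
        "SLAVE", Set.of("MASTER", "OFFLINE"),
        "MASTER", Set.of("SLAVE"));

    static boolean isLegalFiniteTransition(String from, String to) {
        return FINITE_TRANSITIONS.getOrDefault(from, Set.of()).contains(to);
    }

    // Dynamic model: the set of values is open-ended (for example "1.0.0",
    // "1.1.0", ...). The holder only stores what the application sets; it
    // cannot enumerate states or compute a "best possible" value.
    static final class DynamicState {
        private String value;

        DynamicState(String initial) {
            this.value = initial;
        }

        void set(String newValue) {
            this.value = newValue;
        }

        String get() {
            return value;
        }
    }
}
```

The finite table can reject an illegal move such as OFFLINE to MASTER, while the dynamic holder accepts any value the application provides, which is why its interpretation has to stay with the application.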

Software versions are one of the best examples for understanding dynamic state.

Consider an application that is deployed on multiple nodes, which work together as a cluster. The green node works as the master, and all dark blue nodes are slaves. When admins upgrade the service from 1.0.0 to 1.1.0, they need to upgrade every node to the new version before declaring the upgrade done. After the upgrade process, it is important to ensure that the software versions on all nodes are consistent.

If the Helix framework is used to support upgrading the cluster, it helps simplify the application logic and ensure consistency. For instance, the service (cluster) itself is regarded as the resource, and each node is mapped to a partition. Upgrading is then simply a state transition, and admins can check the external view to verify consistency.
Note that during this version upgrade, the master node remains the master and the slave nodes remain slaves. So the version state is parallel to the other states.
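The consistency check described above can be sketched in Java. This is a minimal sketch under assumptions: `UpgradeCheckSketch` and `isUpgradeComplete` are hypothetical names, and the map of node name to reported version is a stand-in for what an admin would read from the external view, not the actual Helix data structure.

```java
import java.util.Map;

// Hypothetical illustration only; not part of the Helix API.
class UpgradeCheckSketch {
    // versionByNode stands in for the per-node version states an admin would
    // read from the external view. The upgrade counts as complete only when
    // every node reports the target version.
    static boolean isUpgradeComplete(Map<String, String> versionByNode, String targetVersion) {
        if (versionByNode.isEmpty()) {
            return false; // nothing reported yet, so nothing can be confirmed
        }
        return versionByNode.values().stream().allMatch(targetVersion::equals);
    }
}
```

For example, a cluster where one node still reports 1.0.0 while the others report 1.1.0 is not yet consistent, regardless of which node is currently the master.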

People

Assignee: Jiajun Wang

Reporter: Jiajun Wang

Dates

Created: 31/May/17 07:09

Updated: 31/Jan/18 23:39