Affects Version/s: FUTURE
Fix Version/s: None
Greatly improve (by roughly 100x) the maximum rate at which the cluster size can increase (the ‘maximum rate of ascent’) when the cluster is subjected to a sudden increase in load. Scaling decisions can be made continuously, because a decision no longer has to be delayed by a cluster monitor interval while the system tends toward a steady state.
Eliminate redundant cartridges being spawned or terminated when cartridge startup or shutdown takes longer than a scaling decision interval (the cluster monitor interval).
Current flow: measured health statistic -> sent to CEP -> 1 minute average -> forward prediction (using the average, the average gradient and the average second gradient) -> use the autoscale policy to calculate the number of required cartridges -> compare the required cartridge count to the current cartridge count and scale appropriately
Proposed flow: measured health statistic -> sent to CEP -> 1 minute ‘moving’ average (updated per second) -> forward prediction (using the average, the average gradient and the average second gradient) -> use the autoscale policy to calculate the number of required cartridges -> compare the required cartridge count to (the active (current) cartridge count + the spawning cartridge count - the terminating cartridge count) and scale appropriately
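The last three steps of the flow can be sketched as below. This is only an illustration, not the actual autoscaler code: the second-order extrapolation formula, the prediction horizon, and the `load_per_cartridge` policy parameter are all assumptions made for the example, as are the function names.

```python
import math

def predict(avg, grad, second_grad, horizon_s):
    """Second-order extrapolation of the averaged statistic.

    Assumed form: avg + grad * t + 0.5 * second_grad * t^2.
    """
    return avg + grad * horizon_s + 0.5 * second_grad * horizon_s ** 2

def required_cartridges(predicted_load, load_per_cartridge, min_count=1):
    """Hypothetical policy: one cartridge per `load_per_cartridge` units of load."""
    return max(min_count, math.ceil(predicted_load / load_per_cartridge))

def scaling_delta(required, active, spawning, terminating):
    """Positive: spawn that many; negative: terminate that many; zero: no action."""
    effective = active + spawning - terminating
    return required - effective

# Example values chosen only for illustration.
predicted = predict(avg=300.0, grad=4.0, second_grad=0.125, horizon_s=60)
required = required_cartridges(predicted, load_per_cartridge=100.0)
delta = scaling_delta(required, active=4, spawning=3, terminating=1)
print(predicted, required, delta)  # 765.0 8 2
```

With 3 cartridges already spawning and 1 terminating, the effective cluster size is 6, so only 2 more cartridges are requested rather than 4.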
Implementation of ‘spawning’/‘terminating’ cartridge count:
Currently the autoscale feature is not aware of the number of cartridges in the cluster that are transitioning to or from the ACTIVE state. The proposed enhancement relies on knowing this count at any given moment.
This can be implemented using asynchronous events, where:
‘MEMBER SPAWNED EVENT’ -> increments cluster-cartridge-count-spawned
‘MEMBER ACTIVE EVENT’ -> decrements cluster-cartridge-count-spawned, and increments cluster-cartridge-count-active
‘MEMBER TERMINATING EVENT’ -> increments cluster-cartridge-count-terminating
‘MEMBER TERMINATED EVENT’ -> decrements cluster-cartridge-count-terminating, and decrements cluster-cartridge-count-active
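The four event rules above can be sketched as a small counter object. The class name, the `on_event` method and the use of plain strings for event types are hypothetical; the lock is there on the assumption that events arrive asynchronously from more than one thread.

```python
import threading

class ClusterCartridgeCounts:
    """Tracks cartridge counts per cluster from member lifecycle events."""

    def __init__(self):
        self._lock = threading.Lock()
        self.spawned = 0      # cluster-cartridge-count-spawned
        self.active = 0       # cluster-cartridge-count-active
        self.terminating = 0  # cluster-cartridge-count-terminating

    def on_event(self, event):
        with self._lock:
            if event == "MEMBER SPAWNED EVENT":
                self.spawned += 1
            elif event == "MEMBER ACTIVE EVENT":
                self.spawned -= 1
                self.active += 1
            elif event == "MEMBER TERMINATING EVENT":
                self.terminating += 1
            elif event == "MEMBER TERMINATED EVENT":
                self.terminating -= 1
                self.active -= 1
            else:
                raise ValueError("unknown event: " + event)

    def effective_size(self):
        """Active + spawning - terminating, as used in the scaling comparison."""
        with self._lock:
            return self.active + self.spawned - self.terminating

counts = ClusterCartridgeCounts()
for e in ["MEMBER SPAWNED EVENT"] * 3 + ["MEMBER ACTIVE EVENT"] * 2:
    counts.on_event(e)
counts.on_event("MEMBER TERMINATING EVENT")
print(counts.effective_size())  # 2
counts.on_event("MEMBER TERMINATED EVENT")
print(counts.effective_size())  # 2
```

Note that the effective size does not change between the TERMINATING and TERMINATED events: the cartridge is already discounted as soon as it starts terminating.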
By compensating the ‘current’ cartridge count (the cluster size) with the cartridges that are in transition, we avoid duplicated scaling decisions while still allowing scaling decisions to be made continuously. This greatly improves our ‘maximum rate of ascent’ when scaling up the cluster in reaction to a sudden increase in load.
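The effect of the compensation can be illustrated with a toy simulation. Everything here is an assumption made for the example: time is measured in abstract ticks, a decision is made every tick, startup takes a fixed 5 ticks, the required count is held constant, and scale-down is ignored.

```python
def simulate(compensate, required=10, active=4, startup_ticks=5, ticks=8):
    """Return the total number of cartridges spawned over `ticks` decisions."""
    pending = []  # remaining startup ticks for each spawning cartridge
    total_spawned = 0
    for _ in range(ticks):
        # Advance startup; cartridges that finish become active.
        pending = [t - 1 for t in pending]
        active += sum(1 for t in pending if t == 0)
        pending = [t for t in pending if t > 0]
        # Uncompensated decisions see only active cartridges.
        current = active + len(pending) if compensate else active
        delta = required - current
        if delta > 0:
            pending.extend([startup_ticks] * delta)
            total_spawned += delta
    return total_spawned

print(simulate(compensate=False))  # 30
print(simulate(compensate=True))   # 6
```

Without compensation, the same shortfall of 6 cartridges is requested again on every tick until the first batch becomes active, so 30 cartridges are spawned where 6 were needed. With compensation, the decision can still run every tick but spawns the 6 cartridges only once.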