Description
The load of streaming jobs usually fluctuate according to the input rate or operations (e.g., window). Supporting the automatic scaling could reduce the operational cost of running streaming applications, while minimizing the performance degradation that can be caused by the bursty loads.
We can harness the cloud resources such as VMs and serverless frameworks to acquire computing resources on demand. To realize the automatic scaling, the following features should be implemented.
1) state migration: scaling jobs require moving tasks (or partitioning a task to multiple ones). In this situation, the internal state of the task should be serialized/deserialized.
2) input/output rerouting: if a task is moved to a new worker, the input and output of the task should be redirected.
3) dynamic Executor or Task creation/deletion: Executor}}s or {{Task can be dynamically created or deleted.
4) scaling policy: a scaling policy that decides when and how to scale out/in should be implemented.