Details
- Type: Improvement
- Status: Resolved
- Priority: Not a Priority
- Resolution: Fixed
Description
This is a follow-up to reactive mode, introduced in FLINK-10407.
Introduce a cooldown timeout after each scaling action, during which no further scaling actions are performed.
Without such a cooldown timeout, unfortunate timing can cause the job to be rescaled very frequently, because TaskManagers do not all connect at the same time.
With the current implementation (1.13), this only applies to scaling up, but with autoscaling support it can also apply to scaling down.
With this implemented, users can define a cooldown timeout of, say, 5 minutes: if TaskManagers then connect slowly one after another, we will only rescale every 5 minutes.
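The intended behavior can be illustrated with a minimal sketch. This is not Flink's AdaptiveScheduler code: the class `RescaleCooldown` and its method `tryRescale` are hypothetical names introduced only for illustration, and the current time is passed in explicitly so the example stays deterministic.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical cooldown gate (illustration only, not Flink's implementation). */
class RescaleCooldown {
    private final Duration cooldown;
    private Instant lastRescale; // null until the first rescale has happened

    RescaleCooldown(Duration cooldown) {
        this.cooldown = cooldown;
    }

    /**
     * Returns true and records the rescale if no cooldown is active at {@code now};
     * returns false (and changes nothing) while the cooldown is still running.
     */
    boolean tryRescale(Instant now) {
        if (lastRescale != null && now.isBefore(lastRescale.plus(cooldown))) {
            return false; // still cooling down: skip this scaling action
        }
        lastRescale = now;
        return true;
    }
}
```

With a 5-minute cooldown and TaskManagers connecting at minutes 0, 1, 2 and 6, `tryRescale` would allow a rescale at minute 0, reject the ones at minutes 1 and 2, and allow the one at minute 6. The sketch only gates requests; a real scheduler would presumably also need to re-check for newly available resources once the cooldown expires, so that TaskManagers arriving during the cooldown are eventually used.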
Issue Links
- causes
  - FLINK-34272 AdaptiveSchedulerClusterITCase failure due to MiniCluster not running (Resolved)
  - FLINK-33976 AdaptiveScheduler cooldown period is taken from a wrong configuration (Resolved)
- is duplicated by
  - FLINK-32484 AdaptiveScheduler combined restart during scaling out (Closed)
- is related to
  - FLINK-10407 FLIP-159: Reactive mode (Closed)