[AURORA-1240] Ignore JobUpdateSettings.maxWaitToInstanceRunningMs in the scheduler - ASF JIRA

Details

Type: Task
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 0.8.0
Component/s: Scheduler
Labels: None
Epic Link: 0.8.0 deprecations

Description

The UpdateConfig restart_threshold [1] setting does not appear to deliver much user value: it is highly sensitive to scheduling performance and may result in aborted or rolled-back job updates when set too low.

Some background: this timeout bounds how long a task may take to transition from PENDING to RUNNING during a job update. When cluster capacity is short, assigning a task to a host may take considerably longer, so the timeout expires and, depending on the failure settings, causes an unnecessary job update abort or rollback. The setting was meant to give users some protection against unsatisfiable resource or constraint requirements. In practice, it has proved to be more of an annoyance: updates are interrupted because of unexpected delays in task assignment.
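The failure mode described above can be illustrated with a small, self-contained sketch. This is not the scheduler's code: `UpdateSettings`, `instance_outcome`, the `enforce_timeout` flag, and all the delay values are hypothetical names and numbers chosen for illustration. It only assumes that the timeout covers the whole PENDING to RUNNING interval, so scheduling delay counts against it.

```python
from dataclasses import dataclass


@dataclass
class UpdateSettings:
    # Hypothetical stand-in for JobUpdateSettings; only the field under discussion.
    max_wait_to_instance_running_ms: int


def instance_outcome(settings, scheduling_delay_ms, startup_ms, enforce_timeout=True):
    """Return "RUNNING" if the instance gets there, or "FAILED" if the timeout expires first.

    The time spent PENDING (waiting for a host) and the time spent starting up
    both count against the same timeout in this simplified model.
    """
    time_to_running_ms = scheduling_delay_ms + startup_ms
    if enforce_timeout and time_to_running_ms > settings.max_wait_to_instance_running_ms:
        return "FAILED"
    return "RUNNING"


settings = UpdateSettings(max_wait_to_instance_running_ms=30_000)

# Healthy cluster: the task is assigned quickly and the update proceeds.
normal = instance_outcome(settings, scheduling_delay_ms=2_000, startup_ms=5_000)

# Capacity shortage: the same task config fails only because assignment was slow.
congested = instance_outcome(settings, scheduling_delay_ms=60_000, startup_ms=5_000)

# With the value ignored, slow assignment no longer fails the instance.
ignored = instance_outcome(
    settings, scheduling_delay_ms=60_000, startup_ms=5_000, enforce_timeout=False
)

print(normal, congested, ignored)
```

In this model the congested case fails even though nothing is wrong with the task itself, which is the unnecessary abort or rollback the description refers to.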

Consider deprecating and subsequently removing this setting.

This ticket tracks the first step: ignoring this value in the scheduler updater. See the linked tickets for follow-up work.

[1] - https://github.com/apache/aurora/blob/master/docs/configuration-reference.md#updateconfig-objects

Issue Links

duplicates

AURORA-787 Avoid using maxWaitToInstanceRunningMs for instance killing wait (Resolved)

relates to

AURORA-1247 Remove JobUpdateSettings.maxWaitToInstanceRunningMs (Resolved)

AURORA-1252 Deprecate UpdateConfig restart_threshold setting (Resolved)

AURORA-1253 Warn when JobUpdateSettings.maxWaitToInstanceRunningMs is set (Resolved)

People

Assignee: Bill Farner

Reporter: Maxim Khutornenko

Votes: 0

Watchers: 3

Dates

Created: 02/Apr/15 21:05

Updated: 04/May/15 13:12

Resolved: 06/Apr/15 23:21