[MESOS-10118] Agent incorrectly handles draining when empty - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 1.9.0
Fix Version/s: 1.9.1, 1.10.0
Component/s: agent
Labels: None
Target Version/s: 1.9.1, 1.10.0
Epic Link:
Agent Draining

Description

When the agent receives a DrainSlaveMessage while it has no tasks or operations, it writes the DrainConfig to disk and then remains stuck in a "draining" state indefinitely. For example, if an agent reregistration is triggered at such a time, the master may conclude that the agent is operating normally and send it a task; the task then fails because the agent still believes it is draining (see this test for an example: https://reviews.apache.org/r/72364/).

If the agent receives a DrainSlaveMessage when it has no tasks or operations, it should avoid writing any DrainConfig to disk so that it immediately "transitions" into the already-drained state.
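
The proposed behavior can be sketched roughly as follows. This is a minimal illustration, not the actual agent code: `AgentState`, `handleDrain`, and the fields below are hypothetical stand-ins, and the on-disk DrainConfig is modeled as a simple flag.

```cpp
// Hypothetical stand-in for the drain configuration carried by a
// DrainSlaveMessage; the field here is illustrative only.
struct DrainConfig {
  bool markGone = false;
};

// Hypothetical, simplified view of the agent's state.
struct AgentState {
  int pendingTasks = 0;       // tasks still running on the agent
  int pendingOperations = 0;  // operations still in flight

  // Models whether a DrainConfig has been written to disk.
  bool drainConfigCheckpointed = false;

  // In this sketch, the agent considers itself draining exactly when
  // a DrainConfig is checkpointed.
  bool isDraining() const { return drainConfigCheckpointed; }
};

// Handles a drain request. Returns true if the agent entered the
// draining state, false if it was already empty and therefore counts
// as drained immediately.
bool handleDrain(AgentState& agent, const DrainConfig& /*config*/) {
  if (agent.pendingTasks == 0 && agent.pendingOperations == 0) {
    // Nothing to drain: skip the checkpoint so the agent does not get
    // stuck in a draining state that nothing will ever end.
    return false;
  }

  // There is work to drain: checkpoint the config (modeled as a flag).
  agent.drainConfigCheckpointed = true;
  return true;
}
```

With this shape, an empty agent that receives a drain request never records a DrainConfig, so a later reregistration cannot leave the master and the agent disagreeing about whether the agent is draining.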

Issue Links

relates to: MESOS-10116 Attempt to reactivate disconnected agent crashes the master (Resolved)

People

Assignee: Greg Mann

Reporter: Greg Mann

Votes: 0

Watchers: 1

Dates

Created: 15/Apr/20 21:22

Updated: 07/May/20 01:08

Resolved: 07/May/20 00:29