Details
- Bug
- Status: Resolved
- Major
- Resolution: Duplicate
- None
- None
- 3
Description
The master crashes with a CHECK failure in the following scenario:
- Framework launches task X on agent A1. The framework may or may not be partition-aware; let's assume it is not partition-aware.
- A1 becomes partitioned from the master.
- Framework launches task X on agent A2.
- Master fails over.
- Agents A1 and A2 both re-register with the master. Because the master has failed over, the task on A1 is not terminated ("non-strict registry semantics").
This results in two running tasks with the same ID, which causes a master CHECK failure among other badness:
master.hpp:2299] Check failed: !tasks.contains(task->task_id()) Duplicate task b88153a2-571a-41e7-9e9b-c297fef4f3cd of framework eaef1879-8cc9-412f-928d-86c9925a7abb-0000
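The sequence above can be modeled with a minimal C++ sketch. This is not the Mesos master code: `Task`, `FrameworkTasks`, and `reregisterAgents` are hypothetical names invented for illustration, and the per-framework task map is assumed to be keyed by task ID only. The sketch shows a freshly failed-over master rebuilding its state from agent re-registrations, and reports the second task with the same ID instead of aborting as the CHECK does.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for a task record (not the Mesos type).
struct Task {
  std::string taskId;
  std::string agentId;
};

// Hypothetical per-framework task table, keyed by task ID only.
class FrameworkTasks {
public:
  // Returns true if the task was recorded, false if a task with the same
  // ID is already known. The real master CHECK-fails in the latter case.
  bool addTask(const Task& task) {
    return tasks.insert(std::make_pair(task.taskId, task)).second;
  }

  std::size_t size() const { return tasks.size(); }

private:
  std::map<std::string, Task> tasks;
};

// Replays agent re-registration against a master that has just failed
// over and therefore starts with no knowledge of earlier launches.
// Returns the tasks whose IDs collided with an already recorded task.
std::vector<Task> reregisterAgents(
    FrameworkTasks& framework,
    const std::vector<Task>& reportedTasks) {
  std::vector<Task> rejected;
  for (const Task& task : reportedTasks) {
    if (!framework.addTask(task)) {
      rejected.push_back(task);
    }
  }
  return rejected;
}
```

Feeding this model task X from A1 followed by task X from A2 leaves one recorded task and one rejected task, which is the point at which the master hits the duplicate-task CHECK.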
Attachments
Issue Links
- duplicates: MESOS-3070 Master CHECK failure if a framework uses duplicated task id. (Open)
- relates to: MESOS-6805 Check unreachable task cache for task ID collisions on launch (Resolved)