Details
- Type: Improvement
- Status: Resolved
- Priority: Critical
- Resolution: Fixed
- None
- Sprint: Twitter Mesos Q2 Sprint 6
- 3
Description
We've observed that both implicit and explicit reconciliation are expensive for large numbers of tasks:
Explicit with O(100,000) tasks: 70 secs
I0625 20:55:23.716320 21937 master.cpp:3863] Performing explicit task state reconciliation for N tasks of framework F (NAME) at S@IP:PORT
I0625 20:56:34.812464 21937 master.cpp:5041] Removing task T with resources R of framework F on slave S at slave(1)@IP:PORT (HOST)
Implicit with O(100,000) tasks: 60 secs
I0625 20:25:22.310601 21936 master.cpp:3802] Performing implicit task state reconciliation for framework F (NAME) at S@IP:PORT
I0625 20:26:23.874528 21921 master.cpp:218] Scheduling shutdown of slave S due to health check timeout
Let's add a benchmark to see if there are any bottlenecks here, and to guide improvements.
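As a quick check of the timings quoted above, the gap between the two log lines in each pair can be computed from their timestamps. This is a minimal sketch, assuming the usual glog prefix layout (severity letter plus MMDD, then HH:MM:SS.microseconds) and that both lines were written on the same day. The helper names `parse_glog_time` and `elapsed_seconds` are hypothetical and not part of Mesos.

```python
from datetime import datetime


def parse_glog_time(line: str) -> datetime:
    # Assumed glog prefix: severity letter + MMDD, then HH:MM:SS.uuuuuu,
    # e.g. "I0625 20:55:23.716320".
    date_part, time_part = line.split()[:2]
    # The year is not logged; a fixed placeholder year is enough for
    # computing differences within the same year.
    return datetime.strptime(
        f"2000{date_part[1:]} {time_part}", "%Y%m%d %H:%M:%S.%f"
    )


def elapsed_seconds(first: str, second: str) -> float:
    # Wall-clock gap between two log lines, in seconds.
    return (parse_glog_time(second) - parse_glog_time(first)).total_seconds()


# Log prefixes copied from the description above.
explicit = elapsed_seconds(
    "I0625 20:55:23.716320 21937 master.cpp:3863]",
    "I0625 20:56:34.812464 21937 master.cpp:5041]",
)
implicit = elapsed_seconds(
    "I0625 20:25:22.310601 21936 master.cpp:3802]",
    "I0625 20:26:23.874528 21921 master.cpp:218]",
)
print(f"explicit: {explicit:.1f}s, implicit: {implicit:.1f}s")
```

The gaps come out at about 71 and 62 seconds, in line with the rough 70 and 60 second figures. Note that the second line of each pair is only the next event the master logged, so the gap approximates how long the master was occupied rather than giving an exact reconciliation time.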
Attachments
Issue Links
- is blocked by MESOS-2941: Add a benchmark for task reconciliation. (Accepted)