Description
The full test suite now takes more than 9 minutes (581 seconds on my machine). This epic tracks techniques to speed it up:
- Now that the master and the slave have to perform sync'ed disk writes, consider using tmpfs (e.g. under /dev/shm) to speed up the disk writes. For the master, we could also consider defaulting to in-memory state rather than the replicated log for most tests.
- The reaper takes a full second to reap an exited process (MESOS-1199). This adds a second to each slave recovery test, and possibly more for anything that relies on Subprocess.
- The command executor sleeps for a second when shutting down (MESOS-442). This adds a second to every test that uses the command executor.
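To illustrate the tmpfs idea, here is a minimal sketch of how a test wrapper could choose a scratch directory. It prefers `/dev/shm` when that path exists and is writable, and otherwise falls back to the system temp directory. The helper names `pick_scratch_root` and `make_test_workdir` and the `mesos-tests-` prefix are hypothetical, and whether the test binary can be pointed at such a directory is an assumption.

```python
import os
import tempfile


def pick_scratch_root(candidates=("/dev/shm",)):
    """Return the first candidate that is a writable directory.

    Falls back to the system default temp directory when no candidate
    qualifies (for example, on platforms without /dev/shm).
    """
    for path in candidates:
        if os.path.isdir(path) and os.access(path, os.W_OK | os.X_OK):
            return path
    return tempfile.gettempdir()


def make_test_workdir(prefix="mesos-tests-"):
    """Create a fresh work directory, on tmpfs when one is available."""
    return tempfile.mkdtemp(prefix=prefix, dir=pick_scratch_root())
```

The caller is responsible for removing the directory afterwards, which matters more on tmpfs because its contents consume memory.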
A big improvement will come from running the tests in parallel. A few options:
- Use automake's parallel test harness to compile tests separately and run tests in parallel (see here).
- Continue to use one test binary, but leverage Google Test's ability to shard tests across processes/machines (see here). This entails writing our own test wrapper script to decide how many workers to use, etc. gtest-parallel is an example of a parallel runner, but it does not leverage the sharding ability.
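The sharding option could look roughly like the sketch below. Google Test selects a shard through the `GTEST_TOTAL_SHARDS` and `GTEST_SHARD_INDEX` environment variables, so a wrapper only needs to start one process per shard with those set. The function names `shard_env` and `run_shards` are hypothetical. The sketch also assumes the tests are safe to run concurrently (no clashes over ports or work directories), which has not been established here.

```python
import os
import subprocess
import sys


def shard_env(index, total, base_env=None):
    """Return a copy of the environment that selects one Google Test shard."""
    env = dict(os.environ if base_env is None else base_env)
    env["GTEST_TOTAL_SHARDS"] = str(total)
    env["GTEST_SHARD_INDEX"] = str(index)
    return env


def run_shards(command, workers):
    """Start `workers` copies of `command`, one per shard, and wait for all.

    Returns the exit codes in shard order; the run passed only if every
    code is zero.
    """
    procs = [
        subprocess.Popen(command, env=shard_env(i, workers))
        for i in range(workers)
    ]
    return [p.wait() for p in procs]
```

A wrapper would call something like `run_shards(["./mesos-tests"], os.cpu_count() or 1)`, where the binary path is a placeholder, and fail if any exit code is non-zero. Output from the shards interleaves on the terminal, so a real wrapper would likely capture it per shard.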
Issue Links
- is blocked by
  - MESOS-4156 Speed up FetcherCacheTest.* and FetcherCacheHttpTest.* (Resolved)
  - MESOS-4157 Speed up ZooKeeper-related tests (Reviewable)
  - MESOS-4158 Speed up SlaveRecoveryTest.* (Reviewable)
  - MESOS-4159 Speed up GroupTest.* (Reviewable)
  - MESOS-4155 Speed up ExamplesTest.* (Resolved)
- is duplicated by
  - MESOS-2059 improve performance of expensive tests (Resolved)
- relates to
  - MESOS-3760 Remove fragile sleep() from ProcessManager::settle() (Resolved)
  - MESOS-1582 Improve build time. (Accepted)
  - MESOS-1199 Subprocess is "slow" -> gated by process::reap poll interval (Resolved)