Details
Bug
Status: Resolved
Major
Resolution: Fixed
0.24.0
Twitter Mesos Q3 Sprint 5
3
Description
Currently the perf event isolator times out a sample once the sample duration plus a fixed extra 2 seconds has elapsed:
Duration timeout = flags.perf_duration + Seconds(2);
This should be based on the reap interval maximum.
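One way to read this suggestion: the margin on top of the sample duration should track how late the exit of the perf process might be noticed, rather than a hard-coded 2 seconds. The sketch below only illustrates that arithmetic using std::chrono. The helper name sampleTimeout and the maxReapInterval parameter are hypothetical stand-ins, not the isolator's actual code or the libprocess API.

```cpp
#include <chrono>

using Seconds = std::chrono::seconds;

// Hypothetical helper (not Mesos code): derive the sample timeout from the
// sample duration plus the longest interval the reaper may wait before it
// notices that the perf process has exited.
inline Seconds sampleTimeout(Seconds perfDuration, Seconds maxReapInterval)
{
  return perfDuration + maxReapInterval;
}
```

With this shape, a change to the reap interval maximum automatically adjusts the timeout, so the two values cannot drift apart.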
Also, the code stops sampling altogether when a single timeout occurs. We've observed timeouts during normal operation, so it would be better for the isolator to continue perf sampling after a timeout. It may also make sense to continue sampling after errors, since these may be transient.
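The behavior proposed above can be sketched as a loop that logs a failed sample and moves on to the next one instead of stopping. This is a minimal standalone illustration, not the isolator's implementation: SampleStatus, SampleResult, and runSampling are hypothetical names, and the real code is asynchronous rather than a plain loop.

```cpp
#include <functional>
#include <iostream>
#include <string>

// Hypothetical outcome of one sampling attempt.
enum class SampleStatus { Ok, TimedOut, Failed };

struct SampleResult
{
  SampleStatus status;
  std::string message;  // Details for logging; empty on success.
};

// Runs `rounds` sampling attempts and returns how many succeeded.
// A timeout or error is logged and the loop continues, on the assumption
// that such failures may be transient.
int runSampling(int rounds, const std::function<SampleResult()>& sample)
{
  int succeeded = 0;
  for (int i = 0; i < rounds; ++i) {
    const SampleResult result = sample();
    switch (result.status) {
      case SampleStatus::Ok:
        ++succeeded;
        break;
      case SampleStatus::TimedOut:
        std::cerr << "perf sample timed out: " << result.message << std::endl;
        break;  // Keep sampling.
      case SampleStatus::Failed:
        std::cerr << "perf sample failed: " << result.message << std::endl;
        break;  // Keep sampling.
    }
  }
  return succeeded;
}
```

A real implementation might also want to bound or rate-limit the logging so that a persistent failure does not flood the logs.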