Details
- Type: Improvement
- Status: Accepted
- Priority: Minor
- Resolution: Unresolved
Description
We should consider improving how benchmarks report their results.
As an example, consider SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/1. It logs lines like:

    [==========] Running 10 tests from 1 test case.
    [----------] Global test environment set-up.
    [----------] 10 tests from SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test
    [ RUN      ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/0
    Using 1000 agents and 1 frameworks
    Added 1 frameworks in 526091ns
    Added 1000 agents in 61.116343ms
    round 0 allocate() took 14.70722ms to make 0 offers after filtering 1000 offers
    round 1 allocate() took 15.055396ms to make 0 offers after filtering 1000 offers
    [       OK ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/0 (135 ms)
    [ RUN      ] SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/1
I believe there are a number of usability issues with this output format:
- lines with benchmark data have to be grep'ed out of the test log using a test-dependent pattern
- test parameters have to be inferred manually from the test name
- there is no consistent time unit; instead, Duration values are simply pretty-printed
This makes it hard to consume these results in a generic way (e.g., for plotting or comparison), since doing so likely requires implementing a custom log parser for each test.
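As a rough illustration of that cost, here is a minimal sketch of the kind of parser a consumer has to write today for just this one benchmark. The regular expression and unit table below are assumptions tied to the exact wording of the current log lines and would break whenever that wording changes.

    #!/usr/bin/env python
    # Hypothetical parser for the current DeclineOffers log output. It only
    # extracts the per-round allocate() lines; every other benchmark would
    # need its own patterns.
    import re
    import sys

    ROUND_RE = re.compile(
        r'round (\d+) allocate\(\) took ([0-9.]+)(ns|us|ms|secs?) '
        r'to make (\d+) offers after filtering (\d+) offers')

    # Assumed unit suffixes produced by Duration pretty-printing.
    UNIT_TO_MS = {'ns': 1e-6, 'us': 1e-3, 'ms': 1.0, 'sec': 1000.0, 'secs': 1000.0}

    def parse(log):
        """Yield (round, milliseconds, offers, filtered) tuples from a test log."""
        for line in log:
            match = ROUND_RE.search(line)
            if match:
                rnd, value, unit, offers, filtered = match.groups()
                yield (int(rnd), float(value) * UNIT_TO_MS[unit],
                       int(offers), int(filtered))

    if __name__ == '__main__':
        for record in parse(sys.stdin):
            print(record)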
We should consider introducing a generic way to log results from tests that requires minimal intervention.
One possible output format is JSON, since it allows combining heterogeneous data like in the example above (which might be harder to do in CSV). A number of standard tools exist for filtering JSON data, and it can also be read by many data analysis tools (e.g., pandas). Example for the data above:

    {
      "case": "SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test",
      "test": "DeclineOffers/0",
      "parameters": [1000, 1],
      "benchmarks": {
        "add_agents": [61.116343],
        "add_frameworks": [0.0526091],
        "allocate": [
          {"round": 0, "time": 14.70722, "offers": 0, "filtering": 1000},
          {"round": 1, "time": 15.055396, "offers": 0, "filtering": 1000}
        ]
      }
    }
Such data could be logged at the end of the test execution with a clear prefix, so that results from many benchmark runs can be aggregated from a single log file with tools like grep. We could provide this in addition to what is already logged (which might be generated by the same tooling).
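Assuming each benchmark emitted exactly one such JSON object per test on a single line behind a fixed prefix (the prefix BENCHMARK_RESULT: below is made up for illustration, and the field names follow the example above), aggregating and comparing results would reduce to a few generic lines, e.g. with pandas:

    #!/usr/bin/env python
    # Sketch of a generic consumer for prefixed JSON benchmark records.
    # No per-test parsing logic is needed beyond the shared prefix.
    import json
    import sys

    import pandas as pd

    PREFIX = 'BENCHMARK_RESULT:'

    def load_results(log):
        """Extract all prefixed JSON records from a (possibly combined) test log."""
        for line in log:
            if line.startswith(PREFIX):
                yield json.loads(line[len(PREFIX):])

    if __name__ == '__main__':
        results = list(load_results(sys.stdin))

        # Flatten the per-round allocate() timings into one row per round,
        # carrying the test name and parameters along for plotting or comparison.
        rows = [
            dict(test=r['test'], parameters=str(r['parameters']), **allocation)
            for r in results
            for allocation in r['benchmarks']['allocate']
        ]

        frame = pd.DataFrame(rows)
        print(frame.groupby('test')['time'].describe())

Results from an entire test run could then be collected with something like grep '^BENCHMARK_RESULT:' test.log and fed to such a script, independent of which benchmarks produced them.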
Issue Links
- is related to MESOS-4559 Run benchmark tests in ASF CI (Open)