We should consider improving how benchmarks report their results.
As an example, consider SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/1. It logs lines like
I believe there are a number of usability issues with this output format:
- lines with benchmark data need to be grep'd from the test log using a format that differs from test to test
- test parameters need to be manually inferred from the test name
- no consistent time unit is used throughout; instead, Duration values are just pretty-printed
This makes it hard to consume these results in a generic way (e.g., for plotting or comparison), since one likely needs to implement a custom log parser for each test.
We should consider introducing a generic way to log results from tests which requires minimal intervention.
One possible output format is JSON, as it allows combining heterogeneous data like in the above example (which might be harder to do in CSV). A number of standard tools exist for filtering JSON data, and it can also be read by many data analysis tools (e.g., pandas). Example for the above data:
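As a sketch of what such a JSON record could look like and how little code is needed to consume it (the field names and parameter values here are purely illustrative assumptions, not an existing schema):

```python
import json

# Hypothetical JSON record for one benchmark run. The field names
# ("test", "parameters", "results") and the values are illustrative only.
record = json.loads("""
{
  "test": "SlaveAndFrameworkCount/HierarchicalAllocator_BENCHMARK_Test.DeclineOffers/1",
  "parameters": {"slaveCount": 1000, "frameworkCount": 50},
  "results": {"decline_time_ms": 123.4}
}
""")

# Parameters no longer need to be inferred from the test name, and the
# time unit is explicit in the field name instead of being pretty-printed.
slave_count = record["parameters"]["slaveCount"]
decline_ms = record["results"]["decline_time_ms"]
```

A list of such records could then be handed directly to e.g. pandas for plotting or comparison, with no per-test parsing code.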
Such data could be logged at the end of test execution with a clear prefix, so that results from many benchmark runs can be aggregated from a single log file with tools like grep. This could be provided in addition to what is already logged (and might be generated by the same tool).