[FLINK-27571] Recognize "less is better" benchmarks in regression detection script - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 1.16.0
Fix Version/s: 1.17.0
Component/s: Benchmarks
Labels:
- pull-request-available

Description

Example benchmark:

http://codespeed.dak8s.net:8000/timeline/#/?exe=5&ben=schedulingDownstreamTasks.BATCH&extr=on&quarts=on&equid=off&env=2&revs=200

Proposed solution:

I think #2 is the correct way.
We could modify save_jmh_result.py to set the 'units' and 'lessisbetter' fields of benchmark results correctly. The 'units' value is already contained in the JMH result, and 'lessisbetter' can be derived from the mode (false if the mode is 'thrpt', otherwise true). An example of the JMH result format can be found at https://i.stack.imgur.com/vB3fV.png.
This would fix the web UI as well as the REST result, and regression_report.py would then be able to identify which benchmarks are "less is better" and treat them differently.
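
A minimal sketch of the idea, not the actual change to either script. It assumes the JMH JSON layout in which each entry carries "benchmark", "mode" and a "primaryMetric" object with "score" and "scoreUnit". The function names, the output keys other than 'units' and 'lessisbetter', and the 5% threshold are hypothetical and chosen only for illustration.

```python
def to_result_fields(jmh_entry):
    """Derive 'units' and 'lessisbetter' from one JMH JSON result entry.

    Hypothetical helper: the output keys other than 'units' and
    'lessisbetter' are illustrative, not a confirmed upload format.
    """
    metric = jmh_entry["primaryMetric"]
    return {
        "benchmark": jmh_entry["benchmark"],
        "result_value": metric["score"],
        # The unit is already present in the JMH result.
        "units": metric["scoreUnit"],
        # Throughput ('thrpt') is "more is better"; every other mode
        # measures time per operation, so less is better.
        "lessisbetter": jmh_entry["mode"] != "thrpt",
    }


def is_regression(baseline, current, less_is_better, threshold=0.05):
    """Return True if 'current' is worse than 'baseline' by more than 'threshold'.

    The direction of "worse" depends on the benchmark type. The default
    threshold is an assumed value, not one taken from regression_report.py.
    """
    if less_is_better:
        return current > baseline * (1 + threshold)
    return current < baseline * (1 - threshold)


# Example entry in the assumed JMH JSON shape (values are made up).
entry = {
    "benchmark": "example.SchedulingDownstreamTasks",
    "mode": "avgt",
    "primaryMetric": {"score": 12.5, "scoreUnit": "ms/op"},
}
fields = to_result_fields(entry)
```

With fields like these, a time-based benchmark whose score rises from 100 to 110 would be flagged as a regression, while the same rise on a throughput benchmark would not.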