[FLINK-27571] Recognize "less is better" benchmarks in regression detection script - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 1.16.0
Fix Version/s: 1.17.0
Component/s: Benchmarks
Labels:
- pull-request-available

Description

Example benchmark:

http://codespeed.dak8s.net:8000/timeline/#/?exe=5&ben=schedulingDownstreamTasks.BATCH&extr=on&quarts=on&equid=off&env=2&revs=200

Proposed solution:

I think #2 is the correct way.
One option is to modify save_jmh_result.py so that it correctly sets the 'units' and 'lessisbetter' fields of each benchmark result. The 'units' value is already contained in the JMH result, and 'lessisbetter' can be derived from the benchmark mode (false for 'thrpt' mode, true otherwise). An example of the JMH result format can be found at https://i.stack.imgur.com/vB3fV.png.
This would fix the web UI as well as the REST result, so regression_report.py would be able to identify which benchmarks are "less is better" and treat them differently.
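
A minimal sketch of the idea, not the actual change made to the scripts. It assumes each JMH JSON entry exposes a "mode" key and a "primaryMetric" object with a "scoreUnit" key. The function names (derive_codespeed_fields, is_regression) and the 5% threshold are hypothetical and chosen only for illustration.

```python
from typing import Any, Dict


def derive_codespeed_fields(jmh_entry: Dict[str, Any]) -> Dict[str, Any]:
    """Derive 'units' and 'lessisbetter' from one JMH JSON result entry.

    Assumption: throughput mode ('thrpt') is the only mode where a higher
    score is better; all other modes measure time, so lower is better.
    """
    return {
        "units": jmh_entry["primaryMetric"]["scoreUnit"],
        "lessisbetter": jmh_entry["mode"] != "thrpt",
    }


def is_regression(baseline: float, current: float,
                  less_is_better: bool, threshold: float = 0.05) -> bool:
    """Return True if 'current' is worse than 'baseline' by more than 'threshold'.

    The relative threshold is a hypothetical parameter for this sketch.
    """
    if baseline == 0:
        return False  # relative change is undefined; skip rather than guess
    change = (current - baseline) / baseline
    if less_is_better:
        return change > threshold   # score went up, e.g. ms/op increased
    return change < -threshold      # score went down, e.g. ops/ms decreased


# Illustrative entry; the values are made up for the example.
entry = {"mode": "avgt", "primaryMetric": {"score": 12.5, "scoreUnit": "ms/op"}}
fields = derive_codespeed_fields(entry)
```

With the flag available, a 20% increase in an 'avgt' benchmark would be reported as a regression, while the same increase in a 'thrpt' benchmark would not.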

Attachments

- image-2022-12-29-14-39-59-976.png (802 kB), 29/Dec/22 06:40, Yanfei Lei
- Screenshot_2022-05-09_10-33-11.png (104 kB), 11/May/22 08:22, Roman Khachatryan

Issue Links

- is a child of: FLINK-29825 Improve benchmark stability (Resolved)
- is a clone of: FLINK-27555 Performance regression in schedulingDownstreamTasks on 02.05.2022 (Closed)
- links to: GitHub Pull Request #55
- links to: GitHub Pull Request #63

People

Assignee: Yanfei Lei
Reporter: Roman Khachatryan
Votes: 0
Watchers: 7

Dates

Created: 11/May/22 08:22
Updated: 08/Feb/23 10:45
Resolved: 08/Feb/23 10:45