Uploaded image for project: 'Flink'
  1. Flink
  2. FLINK-27571

Recognize "less is better" benchmarks in regression detection script

Attach filesAttach ScreenshotVotersWatch issueWatchersCreate sub-taskLinkCloneUpdate Comment AuthorReplace String in CommentUpdate Comment VisibilityDelete Comments
    XMLWordPrintableJSON

Details

    Description

      Example benchmark:

      http://codespeed.dak8s.net:8000/timeline/#/?exe=5&ben=schedulingDownstreamTasks.BATCH&extr=on&quarts=on&equid=off&env=2&revs=200

       

      Proposed solution:

      I think #2 is the correct way.
      Maybe we can modify the save_jmh_result.py to correctly set the 'units' and the 'lessisbetter' fields of benchmark results. The 'units' is already contained in the jmh result and the 'lessisbetter' can be derived from the mode(false if it is 'thrpt' mode, otherwise true). An example of the jmh result format can be found at https://i.stack.imgur.com/vB3fV.png.
      This can fix the web UI as well as the REST result, and then the regression_report.py will be able to identify which benchmarks are "less is better" and treat them differently.

       

      Attachments

        Issue Links

        Activity

          This comment will be Viewable by All Users Viewable by All Users
          Cancel

          People

            Yanfei Lei Yanfei Lei
            roman Roman Khachatryan
            Votes:
            0 Vote for this issue
            Watchers:
            7 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Slack

                Issue deployment