    Details

      Description

      Our histograms use DescriptiveStatistics from commons.math, which relies on ResizableDoubleArray for storing the double values. However, this class constantly resizes its internal array and seems to add considerable overhead.

      Additionally, we're not using SynchronizedDescriptiveStatistics, which, according to its docs, we should. Currently, we seem to be somewhat safe because ResizableDoubleArray has some synchronized parts, but these are scheduled to go away with commons.math version 4.

      Internal benchmarks compared the current implementation with two alternatives: one based on a linear array of twice the histogram size (moving the values back to the start once the window reaches the end), and one using a circular array (wrapping around, with a flexible start position). The rows below each separator line were measured with the optimised code from FLINK-10236, FLINK-12981, and FLINK-12982:

      1. only adding values to the histogram
        Benchmark                                       Mode  Cnt      Score        Error   Units
        HistogramBenchmarks.dropwizardHistogramAdd     thrpt   30   47985.359 ±    25.847  ops/ms
        HistogramBenchmarks.descriptiveHistogramAdd    thrpt   30   70158.792 ±   276.858  ops/ms
        --- with FLINK-10236, FLINK-12981, and FLINK-12982 ---
        HistogramBenchmarks.descriptiveHistogramAdd    thrpt   30   75303.040 ±   475.355  ops/ms
        HistogramBenchmarks.descrHistogramCircularAdd  thrpt   30  200906.902 ±   384.483  ops/ms
        HistogramBenchmarks.descrHistogramLinearAdd    thrpt   30  189788.728 ±   233.283  ops/ms
        
      2. after adding each value, also retrieving a common set of metrics:
        Benchmark                                       Mode  Cnt      Score        Error   Units
        HistogramBenchmarks.dropwizardHistogram        thrpt   30     400.274 ±     4.930  ops/ms
        HistogramBenchmarks.descriptiveHistogram       thrpt   30     124.533 ±     1.060  ops/ms
        --- with FLINK-10236, FLINK-12981, and FLINK-12982 ---
        HistogramBenchmarks.descriptiveHistogram       thrpt   30     251.895 ±     1.809  ops/ms
        HistogramBenchmarks.descrHistogramCircular     thrpt   30     301.068 ±     2.077  ops/ms
        HistogramBenchmarks.descrHistogramLinear       thrpt   30     234.050 ±     5.485  ops/ms
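
      For illustration, the two array layouts described above could look roughly like the following. This is only a minimal sketch, not the code that was benchmarked: the class names CircularDoubleWindow and LinearDoubleWindow and all method names are hypothetical, and thread safety is left out entirely.

```java
import java.util.Arrays;

/** Sliding window over a circular array (hypothetical sketch, not the benchmarked code). */
final class CircularDoubleWindow {
    private final double[] values;
    private int next; // index at which the next value is written
    private int size; // number of valid elements, at most values.length

    CircularDoubleWindow(int windowSize) {
        if (windowSize <= 0) {
            throw new IllegalArgumentException("windowSize must be positive");
        }
        this.values = new double[windowSize];
    }

    /** Adds a value, overwriting the oldest one once the window is full. */
    void add(double value) {
        values[next] = value;
        next = (next + 1) % values.length;
        if (size < values.length) {
            size++;
        }
    }

    int size() {
        return size;
    }

    /** Copies the window in insertion order (oldest first), unwrapping the array. */
    double[] toArray() {
        double[] out = new double[size];
        int start = (size < values.length) ? 0 : next; // position of the oldest element
        int firstLen = Math.min(size, values.length - start);
        System.arraycopy(values, start, out, 0, firstLen);
        System.arraycopy(values, 0, out, firstLen, size - firstLen);
        return out;
    }
}

/** Sliding window over a linear array of twice the window size (hypothetical sketch). */
final class LinearDoubleWindow {
    private final int windowSize;
    private final double[] values;
    private int end;  // exclusive end of the valid range
    private int size; // number of valid elements, at most windowSize

    LinearDoubleWindow(int windowSize) {
        if (windowSize <= 0) {
            throw new IllegalArgumentException("windowSize must be positive");
        }
        this.windowSize = windowSize;
        this.values = new double[2 * windowSize];
    }

    /** Adds a value; when the array end is reached, the newest values move to the front. */
    void add(double value) {
        if (end == values.length) {
            // Keep the newest (windowSize - 1) values; the new value completes the window.
            System.arraycopy(values, end - (windowSize - 1), values, 0, windowSize - 1);
            end = windowSize - 1;
        }
        values[end++] = value;
        if (size < windowSize) {
            size++;
        }
    }

    /** Copies the window in insertion order; the valid range is always contiguous. */
    double[] toArray() {
        return Arrays.copyOfRange(values, end - size, end);
    }
}
```

      With a window size of 3, adding the values 1 through 7 leaves both windows holding 5, 6, 7. In this sketch, neither add() allocates or resizes anything. The linear layout pays for an occasional bulk copy but keeps the window contiguous, while the circular layout never moves data on add() but has to unwrap the array when it is read.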
        

              People

              • Assignee: Nico Kruber (NicoK)
              • Reporter: Nico Kruber (NicoK)

                  Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 20m