Uploaded image for project: 'Commons Math'
  1. Commons Math
  2. MATH-1169

Faster Percentile



    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: 4.X
    • Labels:


      Percentile class is now using Hoare selection algorithm (aka Quickselect) with median of 3 for choosing a pivot and caching pivots. Quickselect expected runtime is about 3.4N + O(N).

      The constant can be improved to 1.5N by different pivot strategy involving sampling, yielding the Floyd–Rivest algorithm, which has average complexity of 1.5N + O(N) for median, with other position statistics even faster.

      For proof of concept I implemented Floyd–Rivest algorithm without caching pivots following Wikipedia and benchmarked it with Caliper.

      Array size - runtime for Floyd-Rivest - runtime for current Percentile - % faster
      100000 - 6.907 - 7.083 - 2.5% link
      1000000 - 70.3 - 75.4 - 6.8% link
      10000000 - 708 - 868 - 18.4% link

      In attachments:

      Main.java should be run to repeat experiments, it needs Caliper JAR: https://code.google.com/p/caliper/downloads/detail?name=caliper-1.0-beta-1-all.jar
      Every call to evaluate() I'm filling the input double[] array with different random numbers.

      FloydRivest.java contatins implementation of algorithm basing on pseudocode from its Wikipedia page.

      PercentileFloydRivest.java is Percentile.java with modified evaluate(), so it calls FloydRivest.select() instead of typical select(). pivotsHeap was removed for simplicity (maybe it can improve efficiency even more).


        1. FloydRivest.java
          3 kB
          Adam Stelmaszczyk
        2. Main.java
          1 kB
          Adam Stelmaszczyk
        3. PercentileFloydRivest.java
          15 kB
          Adam Stelmaszczyk

          Issue Links



              • Assignee:
                adams Adam Stelmaszczyk
              • Votes:
                0 Vote for this issue
                2 Start watching this issue


                • Created: