Uploaded image for project: 'Commons Math'
  1. Commons Math
  2. MATH-1152

Suboptimal implementation of EnumeratedDistribution.sample()

    XMLWordPrintableJSON

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.3
    • Fix Version/s: 3.4
    • Labels:
      None

      Description

      org.apache.commons.math3.distribution.EnumeratedDistribution.sample() performs a linear search to find the appropriate element in the probability space (called singletons here) given a random double value. For large probability spaces, this is not effective. Instead, we should cache the cumulative probabilities and do a binary search.

      Rough implementation:

      EnumeratedDistribution.java
      void computeCumulative() {
        cumulative = new double[size]; 
        double sum = 0;
        for (int i = 1; i < weights.length - 1; i++) {
            cumulative[i] = cumulative[i-1] + weights[i-1];
         }
        cumulative[size - 1] = 1;
      }
      

      and then

      EnumeratedDistribution.java
      int sampleIndex() {
       double randomValue = random.nextDouble();
       int result = Arrays.binarySearch(cumulative, randomValue);
       if (result >= 0) return result;
       int insertionPoint = -result-1;
       return insertionPoint;
      }
      
      

        Attachments

          Activity

            People

            • Assignee:
              Unassigned
              Reporter:
              sandris Andras Sereny
            • Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: