Description
The following paper provides a method for sampling from a discrete probability distribution that requires a single 30-bit unisigned integer per sample:
George Marsaglia, Wai Wan Tsang, Jingbo Wang (2004) Fast Generation of Discrete Random Variables. Journal of Statistical Software. Vol. 11, Issue. 3, pp. 1-11.
The paper gives a general method that requires input probabilities expressed as unsigned 30 bit integers. The probabilities thus have an assumed divisor of 2^30^. The probabilities must sum to 2^30^. These are then efficiently tabulated for fast look-up using 5 base-64 digits from the 30-bit number. Sampling then takes 1 random integer per sample. No samples can be obtained for any observation where the probability is < 1^-32^.
The paper provides software to compute the probability distributions for Poisson and Binomial and the look-up tables.
It is applicable to any distribution by normalising to 2^30^:
// Compute the normalisation: 2^30 / sum // (assuming sum has been precomputed) final double normalisation = (1 << 30) / sum; final int[] prob = new int[probabilities.length]; int sum = 0; int max = 0; int mode = 0; for (int i = 0; i < prob.length; i++) { // Add 0.5 for rounding final int p = (int) (probabilities[i] * normalisation + 0.5); sum += p; // Find the mode (maximum probability) if (max < p) { max = p; mode = i; } prob[i] = p; } // The sum must be >= 2^30. // Here just compensate the difference onto the highest probability. prob[mode] += (1 << 30) - sum;
Attachments
Issue Links
- is related to
-
RNG-91 Kemp small mean poisson sampler
- Closed
- links to