  1. Commons Numbers
  2. NUMBERS-184

Reduce number of operations in Precision.equals using a maxUlps

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Trivial
    • Resolution: Fixed
    • Affects Version/s: 1.0
    • Fix Version/s: 1.1
    • Component/s: core
    • Labels: None

    Description

      The Precision class has a method that tests whether two arguments are equal to within a maximum number of representable float values between them (the maxUlps argument).

      This is performed on the IEEE 754 bit layout of the floats. When the two inputs have opposite signs, a lot of code is used to compute the distance of each value from the bit representation of +0.0 or -0.0. This is redundant: if the signs are opposite, the distance of a value from zero is obtained simply by dropping the sign bit from its bit representation. Here is an extract from the current method:

              final int xInt = Float.floatToRawIntBits(x);
              final int yInt = Float.floatToRawIntBits(y);
      
              final boolean isEqual;
              if (((xInt ^ yInt) & SGN_MASK_FLOAT) == 0) {
                  // number have same sign, there is no risk of overflow
                  isEqual = Math.abs(xInt - yInt) <= maxUlps;
              } else {
                  // number have opposite signs, take care of overflow
                  final int deltaPlus;
                  final int deltaMinus;
                  if (xInt < yInt) {
                      deltaPlus  = yInt - POSITIVE_ZERO_FLOAT_BITS;
                      deltaMinus = xInt - NEGATIVE_ZERO_FLOAT_BITS;
                  } else {
                      deltaPlus  = xInt - POSITIVE_ZERO_FLOAT_BITS;
                      deltaMinus = yInt - NEGATIVE_ZERO_FLOAT_BITS;
                  }            
      
                  if (deltaPlus > maxUlps) {
                      isEqual = false;
                  } else {
                      isEqual = deltaMinus <= (maxUlps - deltaPlus);
                  }        
              }

      The second branch can be simplified using bit masking:

                  final int deltaPlus = xInt & Integer.MAX_VALUE;
                  final int deltaMinus = yInt & Integer.MAX_VALUE;   
                  isEqual = (long) deltaPlus + deltaMinus <= maxUlps;

      For the float method, overflow can be avoided by using a long to sum the two deltas, which eliminates a further branch condition.
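
      Put together, the float method with the masked second branch might look like the following minimal sketch. The class and method names (FloatUlpSketch, equalsWithinUlps) are hypothetical, and the trailing NaN check is an assumption about how the surrounding method finishes; it is not part of the extract above.

```java
/** Hypothetical sketch; not the committed Precision code. */
final class FloatUlpSketch {
    /** Mask selecting the sign bit of a float's raw bits. */
    private static final int SGN_MASK_FLOAT = 0x80000000;

    private FloatUlpSketch() {}

    static boolean equalsWithinUlps(float x, float y, int maxUlps) {
        final int xInt = Float.floatToRawIntBits(x);
        final int yInt = Float.floatToRawIntBits(y);

        final boolean isEqual;
        if (((xInt ^ yInt) & SGN_MASK_FLOAT) == 0) {
            // Same sign: the int subtraction cannot overflow.
            isEqual = Math.abs(xInt - yInt) <= maxUlps;
        } else {
            // Opposite signs: dropping the sign bit gives each value's
            // distance from its own zero (+0.0 or -0.0).
            final int deltaPlus = xInt & Integer.MAX_VALUE;
            final int deltaMinus = yInt & Integer.MAX_VALUE;
            // Sum as a long so two large deltas cannot overflow.
            isEqual = (long) deltaPlus + deltaMinus <= maxUlps;
        }
        // Assumption: NaN never compares equal, so it is rejected here.
        return isEqual && !Float.isNaN(x) && !Float.isNaN(y);
    }
}
```

      In this sketch +0.0f and -0.0f are zero ULPs apart, while Float.MIN_VALUE and -Float.MIN_VALUE are two ULPs apart.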

      A different optimisation can be performed for the double argument method. Since the ulp argument is an integer, when the signs are opposite a NaN bit value would be at least (2047L << 52) above zero. Thus there is no need to check for NaN if the numbers are equal within the max ULPs and have opposite signs.
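
      One way the double method could exploit this is sketched below. The names (DoubleUlpSketch, equalsWithinUlps, SGN_MASK) are hypothetical, and the two-step comparison in the opposite-sign branch is an assumption chosen here to avoid overflowing a long sum; the committed code may differ.

```java
/** Hypothetical sketch; not the committed Precision code. */
final class DoubleUlpSketch {
    /** Mask selecting the sign bit of a double's raw bits. */
    private static final long SGN_MASK = 0x8000000000000000L;

    private DoubleUlpSketch() {}

    static boolean equalsWithinUlps(double x, double y, int maxUlps) {
        final long xInt = Double.doubleToRawLongBits(x);
        final long yInt = Double.doubleToRawLongBits(y);

        if (((xInt ^ yInt) & SGN_MASK) == 0L) {
            // Same sign: the long subtraction cannot overflow, but NaN
            // must still be rejected (e.g. NaN compared with itself).
            return Math.abs(xInt - yInt) <= maxUlps
                && !Double.isNaN(x) && !Double.isNaN(y);
        }

        // Opposite signs: distance of each value from its own zero.
        final long deltaPlus = xInt & Long.MAX_VALUE;
        final long deltaMinus = yInt & Long.MAX_VALUE;
        // Any NaN has a masked bit pattern above (2047L << 52), far
        // beyond any int maxUlps, so no explicit NaN check is needed.
        return deltaPlus <= maxUlps && deltaMinus <= (maxUlps - deltaPlus);
    }
}
```

      The comparison is split in two because, unlike the float case, there is no wider primitive type in which to sum two long deltas safely.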

      This optimisation could also be made for the float equals method if it took a short argument, but that would require a breaking API change and so cannot be done. For reference, the maximum difference for doubles is approximately 2^31 / 2^52 of the mantissa for double values with the same exponent, which is a relative error of approximately 4.77e-7.

      Using a short for floats would allow 2^15 / 2^23 of the mantissa, a relative error of approximately 3.9e-3. Using an int argument allows an extreme relative error of 1 when both arguments have the same sign, and an absolute error of more than Float.MAX_VALUE. It makes no sense to compare two float values with a maximum possible ULP difference of more than the range from zero to infinity.
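
      These magnitudes can be checked directly from the bit patterns. The sketch below uses hypothetical names (UlpBudgetSketch and its methods) and measures everything relative to 1.0, which is an arbitrary choice of reference value at the low end of a binade, where the relative spacing is largest.

```java
/** Hypothetical sketch for checking the ULP budgets discussed above. */
final class UlpBudgetSketch {
    private UlpBudgetSketch() {}

    /** Relative distance from 1.0 to the double 'ulps' steps above it. */
    static double relativeErrorDouble(int ulps) {
        final double x = 1.0;
        final double y = Double.longBitsToDouble(Double.doubleToRawLongBits(x) + ulps);
        return (y - x) / x;
    }

    /** Relative distance from 1.0f to the float 'ulps' steps above it. */
    static double relativeErrorFloat(int ulps) {
        final float x = 1.0f;
        final float y = Float.intBitsToFloat(Float.floatToRawIntBits(x) + ulps);
        // Widen to double so the division itself adds no float rounding.
        return ((double) y - x) / x;
    }

    /** Number of float bit patterns between +0.0f and +infinity. */
    static int ulpsFromZeroToInfinity() {
        return Float.floatToRawIntBits(Float.POSITIVE_INFINITY);
    }
}
```

      Calling relativeErrorDouble(Integer.MAX_VALUE) and relativeErrorFloat(Short.MAX_VALUE) gives the two relative errors, and ulpsFromZeroToInfinity() is smaller than Integer.MAX_VALUE, so an int maxUlps can exceed the entire range from zero to infinity.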

       

          People

            Assignee: Unassigned
            Reporter: Alex Herbert (aherbert)