  1. Lucene - Core
  2. LUCENE-467

Use Float.floatToRawIntBits over Float.floatToIntBits

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.9
    • Fix Version/s: None
    • Component/s: core/other
    • Labels:
      None

      Description

      Copied from my email:
      Float.floatToRawIntBits (in Java 1.4) gives the raw float bits without
      normalization (like *(int*)&floatvar would in C). Since it doesn't
      normalize NaN values, it's faster (and hopefully optimized to a
      simple inline machine instruction by the JVM).

      On my Pentium 4, using floatToRawIntBits is over 5 times as fast as
      floatToIntBits.
      That can really add up in something like Similarity.floatToByte() for
      encoding norms, especially if used as a way to compress an array of
      floats at query time as suggested by Doug.
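
To illustrate the normalization difference described above, here is a minimal sketch (the class name NanBitsDemo is made up for illustration). Float.floatToIntBits collapses every NaN to the canonical pattern 0x7fc00000, while Float.floatToRawIntBits returns whatever bits the value holds. For non-NaN values the two methods agree.

```java
final class NanBitsDemo {
    /** True when both conversions return the same int for the given float. */
    static boolean sameBits(float f) {
        return Float.floatToIntBits(f) == Float.floatToRawIntBits(f);
    }

    public static void main(String[] args) {
        // A NaN with a non-canonical payload (quiet bit set, extra mantissa bits).
        float oddNaN = Float.intBitsToFloat(0x7fc12345);

        // floatToIntBits canonicalizes any NaN to 0x7fc00000.
        System.out.println(Integer.toHexString(Float.floatToIntBits(oddNaN)));
        // The raw variant normally keeps the payload, though hardware may alter NaN bits.
        System.out.println(Integer.toHexString(Float.floatToRawIntBits(oddNaN)));

        // Ordinary values are unaffected by the choice of method.
        System.out.println(sameBits(1.0f) && sameBits(-0.0f) && sameBits(Float.MAX_VALUE));
    }
}
```

The extra NaN check is the only work floatToIntBits does beyond the raw conversion, which is why skipping it can matter in a tight loop.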

        Activity

        yseeley@gmail.com Yonik Seeley added a comment -

        Paul Smith's profiling shows encodeNorm() taking 20% of the total indexing time, with floatToIntBits accounting for all of that 20%! Almost hard to believe...

        There should be some good gains with this change.
        It would be nice to change the usage in Query.hashCode too.
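
As a rough illustration of the hashCode suggestion, the sketch below uses a hypothetical BoostedTerm class (not Lucene's Query; the class and its fields are assumptions made for this example). It hashes the boost with floatToRawIntBits and compares the same raw bits in equals() so the two methods stay consistent.

```java
final class BoostedTerm {
    private final String term;   // hypothetical field
    private final float boost;   // hypothetical field

    BoostedTerm(String term, float boost) {
        this.term = term;
        this.boost = boost;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof BoostedTerm)) return false;
        BoostedTerm other = (BoostedTerm) o;
        // Compare raw bits so equals() agrees with hashCode().
        return term.equals(other.term)
            && Float.floatToRawIntBits(boost) == Float.floatToRawIntBits(other.boost);
    }

    @Override
    public int hashCode() {
        // Raw bits skip the NaN check that floatToIntBits performs.
        return term.hashCode() ^ Float.floatToRawIntBits(boost);
    }
}
```

One consequence of using raw bits: two NaN boosts with different payloads would hash and compare as different values, which seems acceptable if NaN boosts are not expected.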

        yseeley@gmail.com Yonik Seeley added a comment -

        With -server mode, it's only 3 times as fast, and both are really fairly fast.
        I do wonder if the profiler had its numbers right, or if the act of observation messed things up... 20% seems too high.

        psmith@apache.org Paul Smith added a comment -

        I probably didn't make my testing framework as clear as I should have. YourKit was set up to use method sampling (waking up every X milliseconds). I wouldn't treat the 20% as an 'accurate' figure, but suffice to say that improving this method would 'certainly' improve things. Only testing the way you have will flush out the correct numbers.

        We don't use -server (due to some Linux vagaries, we've been careful with -server because of some stability problems).

        yseeley@gmail.com Yonik Seeley added a comment -

        Fun with premature optimization!
        I know this isn't a bottleneck, but here is the fastest floatToByte() that I could come up with:

        public static byte floatToByte(float f) {
            int bits = Float.floatToRawIntBits(f);
            if (bits<=0) return 0;
            int mantissa = (bits & 0xffffff) >> 21;
            int exponent = (bits >>> 24) - 63 + 15;
            if ((exponent & ~0x1f)==0) return (byte)((exponent << 3) | mantissa);
            else if (exponent<0) return 1;
            return -1;
        }

        Here is the original from Lucene for reference:

        public static byte floatToByte(float f) {
            if (f < 0.0f)                                 // round negatives up to zero
                f = 0.0f;

            if (f == 0.0f)                                // zero is a special case
                return 0;

            int bits = Float.floatToIntBits(f);           // parse float into parts
            int mantissa = (bits & 0xffffff) >> 21;
            int exponent = (((bits >> 24) & 0x7f) - 63) + 15;
            if (exponent > 31) {                          // overflow: use max value
                exponent = 31;
                mantissa = 7;
            }

            if (exponent < 0) {                           // underflow: use min value
                exponent = 0;
                mantissa = 1;
            }

            return (byte)((exponent << 3) | mantissa);    // pack into a byte
        }

        Here is the performance (in seconds) on my P4 to do 640M conversions:

                JDK14-server  JDK14-client  JDK15-server  JDK15-client  JDK16-server  JDK16-client
        orig          75.422        89.451         8.344        57.631         7.656        57.984
        new           67.265        78.891         5.906        22.172         5.172        18.750
        diff             12%           13%           41%          160%           48%          209%

        Some decent gains... but the biggest moral of the story is: use Java>=1.5 and -server if you can!
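
One way to gain confidence that the rewritten method matches the original is to compare them over a sample of bit patterns. The sketch below is an assumption about how such a check could look, not something posted in this thread; the class and method names are invented. NaN inputs are skipped because a raw NaN can carry a sign bit, which the rewritten version would map to 0 while the original maps every NaN to -1.

```java
final class FloatToByteCheck {
    /** Original version quoted above, using floatToIntBits. */
    static byte original(float f) {
        if (f < 0.0f) f = 0.0f;
        if (f == 0.0f) return 0;
        int bits = Float.floatToIntBits(f);
        int mantissa = (bits & 0xffffff) >> 21;
        int exponent = (((bits >> 24) & 0x7f) - 63) + 15;
        if (exponent > 31) { exponent = 31; mantissa = 7; }
        if (exponent < 0)  { exponent = 0;  mantissa = 1; }
        return (byte) ((exponent << 3) | mantissa);
    }

    /** Rewritten version quoted above, using floatToRawIntBits. */
    static byte rewritten(float f) {
        int bits = Float.floatToRawIntBits(f);
        if (bits <= 0) return 0;
        int mantissa = (bits & 0xffffff) >> 21;
        int exponent = (bits >>> 24) - 63 + 15;
        if ((exponent & ~0x1f) == 0) return (byte) ((exponent << 3) | mantissa);
        else if (exponent < 0) return 1;
        return -1;
    }

    /** Counts disagreements over every stride-th bit pattern, skipping NaN. */
    static int countMismatches(int stride) {
        int mismatches = 0;
        // A long counter lets the loop end cleanly at the top of the int range.
        for (long i = Integer.MIN_VALUE; i <= Integer.MAX_VALUE; i += stride) {
            float f = Float.intBitsToFloat((int) i);
            if (Float.isNaN(f)) continue;
            if (original(f) != rewritten(f)) mismatches++;
        }
        return mismatches;
    }

    public static void main(String[] args) {
        System.out.println("mismatches: " + countMismatches(9973));
    }
}
```

A stride of 1 would cover every float, at the cost of a few billion iterations.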

        yseeley@gmail.com Yonik Seeley added a comment -

        Here is a new version that's faster by keeping the mantissa and exponent
        together. Its fast path does a single shift and a single add after
        getting the float bits.

        public byte floatToByte(float f) {
            int bits = Float.floatToRawIntBits(f);
            int smallfloat = bits >> 21; // only keep 3 highest bits in mantissa
            if (smallfloat < (63-15)<<3) {
                return (bits<=0) ? (byte)0 : (byte)1; // 0 or underflow
            }
            if (smallfloat >= ((63-15)+32)<<3) {
                return -1; // overflow
            }
            return (byte)(smallfloat - ((63-15)<<3));
        }

        JVM            CUR      NEW   DIFF
        14-server   75.422   66.515    13%
        14-client   89.451   79.734    12%
        15-server    8.344    3.859   116%
        15-client   57.631   17.031   238%
        16-server    7.656    3.172   141%
        16-client   57.984   16.531   251%

        These numbers include the overhead of a float loop and the method
        call overhead.
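
The harness behind these numbers was not posted, so the following is only a guess at the general shape of such a timing loop, with invented names and an arbitrary iteration count. Each loop accumulates a sum so the JIT cannot discard the conversion, and both loops run once untimed as a warm-up.

```java
final class FloatBitsBench {
    /** Sums floatToIntBits over n floats; the returned sum keeps the loop alive. */
    static long sumIntBits(int n) {
        long sum = 0;
        float f = 0.0f;
        for (int i = 0; i < n; i++) {
            sum += Float.floatToIntBits(f);
            f += 0.5f;
        }
        return sum;
    }

    /** Same loop using floatToRawIntBits. */
    static long sumRawIntBits(int n) {
        long sum = 0;
        float f = 0.0f;
        for (int i = 0; i < n; i++) {
            sum += Float.floatToRawIntBits(f);
            f += 0.5f;
        }
        return sum;
    }

    public static void main(String[] args) {
        int n = 50000000;            // arbitrary iteration count
        sumIntBits(n);               // warm-up so the JIT compiles both loops
        sumRawIntBits(n);

        long t0 = System.nanoTime();
        long a = sumIntBits(n);
        long t1 = System.nanoTime();
        long b = sumRawIntBits(n);
        long t2 = System.nanoTime();

        System.out.println("floatToIntBits:    " + (t1 - t0) / 1000000 + " ms (sum " + a + ")");
        System.out.println("floatToRawIntBits: " + (t2 - t1) / 1000000 + " ms (sum " + b + ")");
    }
}
```

As noted above, a loop like this also measures the float increment and call overhead, so the ratio between the two timings understates the difference between the conversions themselves.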

        cutting Doug Cutting added a comment -

        How fast can you make:

        public byte floatToByte(float f, int numMantissaBits);

        ?

        That would be more reusable, and shouldn't be much slower...

        yseeley@gmail.com Yonik Seeley added a comment -

        > How fast can you make: public byte floatToByte(float f, int numMantissaBits);

        With Java5 and -server -Xbatch, just as fast as the specialized version! That server JVM is amazing!
        With Java5 -client, it's 60% slower though...
        Still, this code might be good to keep around for double-checking implementations.

        public static byte floatToByte(float f, int numMantissaBits) {
            int rshift = 24-numMantissaBits; // 21 in old func
            int maxexp = 0xff >> numMantissaBits; // 31 in old func
            int zeroexp = 0xff >> (numMantissaBits+1); // 15 in old func
            // int overflowexp = 0x100 >> numMantissaBits; // 32 in old func
            int overflowexp = maxexp+1;
            int bits = Float.floatToRawIntBits(f);
            int smallfloat = bits >> rshift;
            if (smallfloat < (63-zeroexp)<<numMantissaBits) {
                return (bits<=0) ? (byte)0 : (byte)1; // 0 or underflow
            } else if (smallfloat >= (63-zeroexp+overflowexp)<<numMantissaBits) {
                return -1;
            } else {
                return (byte)(smallfloat - ((63-zeroexp)<<numMantissaBits));
            }
        }

        public byte floatToByte(float f) {
            return floatToByte(f,3);
        }
        psmith@apache.org Paul Smith added a comment -

        If you can create a patch against 1.4.3 there is a reasonable possibility that I could create a 1.4.3 Lucene+ThisPatch jar and re-index in our test environment that was the source of the YourKit graph I provided earlier. This should reflect how useful the change might be against a decent baseline?

        yseeley@gmail.com Yonik Seeley added a comment -

        Here's a version that further generalizes the exponent zero point (below are negative exponents, above are positive), and includes the reverse byteToFloat as well.

        public static float byteToFloat(byte b, int numMantissaBits, int zeroExp) {
            if (b == 0) return 0.0f;
            int bits = (b&0xff) << (24-numMantissaBits);
            bits += (63-zeroExp) << 24;
            return Float.intBitsToFloat(bits);
        }

        public float byteToFloat(byte b) {
            return byteToFloat(b, 3, 15);
        }

        public static byte floatToByte(float f, int numMantissaBits, int zeroExp) {
            int shift = 24-numMantissaBits; // 21 in old func
            int maxexp = 0xff >> numMantissaBits; // 31 in old func
            // int zeroExp = 0xff >> (numMantissaBits+1); // 15 in old func
            // int overflowexp = 0x100 >> numMantissaBits; // 32 in old func
            int overflowexp = maxexp+1;
            int bits = Float.floatToRawIntBits(f);
            int smallfloat = bits >> shift;
            if (smallfloat < (63-zeroExp)<<numMantissaBits) {
                return (bits<=0) ? (byte)0 : (byte)1; // 0 or underflow
            } else if (smallfloat >= (63-zeroExp +overflowexp)<<numMantissaBits) {
                return -1;
            } else {
                return (byte)(smallfloat - ((63-zeroExp)<<numMantissaBits));
            }
        }

        public byte floatToByte(float f) {
            return floatToByte(f,3,15);
        }
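
A worked round trip may make the encoding easier to follow. The sketch below restates the two generalized methods in compact form (the class name SmallFloatRoundTrip and the sample values are chosen for illustration only). With 3 mantissa bits and a zero exponent of 15, 1.0f has bits 0x3f800000; shifting right by 21 gives 508, and subtracting (63-15)<<3 = 384 leaves 124. Decoding shifts 124 back and adds 48<<24, which restores 0x3f800000. Values that do not fit exactly are truncated, so 0.3f comes back as 0.25f.

```java
final class SmallFloatRoundTrip {
    static byte floatToByte(float f, int numMantissaBits, int zeroExp) {
        int shift = 24 - numMantissaBits;
        int bits = Float.floatToRawIntBits(f);
        int smallfloat = bits >> shift;
        int lowest = (63 - zeroExp) << numMantissaBits;      // smallest encodable value
        int overflowExp = (0xff >> numMantissaBits) + 1;
        if (smallfloat < lowest) {
            return (bits <= 0) ? (byte) 0 : (byte) 1;        // 0 or underflow
        } else if (smallfloat >= ((63 - zeroExp + overflowExp) << numMantissaBits)) {
            return -1;                                       // overflow
        }
        return (byte) (smallfloat - lowest);
    }

    static float byteToFloat(byte b, int numMantissaBits, int zeroExp) {
        if (b == 0) return 0.0f;
        int bits = (b & 0xff) << (24 - numMantissaBits);
        bits += (63 - zeroExp) << 24;
        return Float.intBitsToFloat(bits);
    }

    public static void main(String[] args) {
        float[] samples = {1.0f, 0.3f, 1e-10f, 1e10f, -2.0f};
        for (float f : samples) {
            byte b = floatToByte(f, 3, 15);
            // Print the input, its unsigned byte code, and the decoded value.
            System.out.println(f + " -> " + (b & 0xff) + " -> " + byteToFloat(b, 3, 15));
        }
    }
}
```

Decoding any byte and encoding the result again returns the same byte, so the lossy step happens only on the first encode.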
        yseeley@gmail.com Yonik Seeley added a comment -

        Committed current implementation as SmallFloat
        http://svn.apache.org/viewcvs.cgi/lucene/java/trunk/src/java/org/apache/lucene/util/SmallFloat.java

        Unless I hear objections, I'll convert the norm encoding/decoding in Similarity to use it.


          People

          • Assignee:
            yseeley@gmail.com Yonik Seeley
            Reporter:
            yseeley@gmail.com Yonik Seeley