Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Trivial Trivial
    • Resolution: Fixed
    • Affects Version/s: 10.2.1.6
    • Fix Version/s: 10.3.1.4
    • Component/s: Miscellaneous, Services
    • Labels:
      None
    • Bug behavior facts:
      Performance

      Description

      The implementation of FormatableBitSet could be streamlined. Dead code can be removed and the implementation of some methods can be simplified.

      1. grow.diff
        2 kB
        Knut Anders Hatlen
      2. shrink.v2.diff
        17 kB
        Dyre Tjeldvoll
      3. shrink.v1.stat
        0.3 kB
        Dyre Tjeldvoll
      4. shrink.v1.diff
        17 kB
        Dyre Tjeldvoll
      5. numbitstest.diff
        3 kB
        Knut Anders Hatlen
      6. numbitsset.v2.diff
        2 kB
        Dyre Tjeldvoll
      7. anysetbit.v2.diff
        7 kB
        Dyre Tjeldvoll
      8. numbitsset.v1.stat
        0.2 kB
        Dyre Tjeldvoll
      9. numbitsset.v1.diff
        2 kB
        Dyre Tjeldvoll
      10. anysetbit.v1.stat
        0.3 kB
        Dyre Tjeldvoll
      11. anysetbit.v1.diff
        6 kB
        Dyre Tjeldvoll
      12. bitopt.v2.diff
        5 kB
        Dyre Tjeldvoll
      13. bitopt.v1.stat
        0.3 kB
        Dyre Tjeldvoll
      14. bitopt.v1.diff
        4 kB
        Dyre Tjeldvoll
      15. bitops.v3.diff
        19 kB
        Dyre Tjeldvoll
      16. bitops.v2.stat
        0.4 kB
        Dyre Tjeldvoll
      17. bitops.v2.diff
        19 kB
        Dyre Tjeldvoll
      18. bitops.v1.stat
        0.3 kB
        Dyre Tjeldvoll
      19. bitops.v1.diff
        19 kB
        Dyre Tjeldvoll
      20. boundarycheck.v1.stat
        0.3 kB
        Dyre Tjeldvoll
      21. boundarycheck.v1.diff
        8 kB
        Dyre Tjeldvoll
      22. unusedmethods.v1.stat
        0.7 kB
        Dyre Tjeldvoll
      23. unusedmethods.v1.diff
        9 kB
        Dyre Tjeldvoll
      24. cleanup2191.stat
        0.2 kB
        Knut Anders Hatlen
      25. cleanup2191.diff
        2 kB
        Knut Anders Hatlen
      26. valuenotnull.v1.stat
        0.5 kB
        Dyre Tjeldvoll
      27. valuenotnull.v1.diff
        8 kB
        Dyre Tjeldvoll
      28. fbstst.v1.stat
        0.6 kB
        Dyre Tjeldvoll
      29. fbstst.v1.diff
        26 kB
        Dyre Tjeldvoll
      30. FormatableBitSetTest.java
        17 kB
        Dyre Tjeldvoll
      31. deadcode.v2.diff
        1 kB
        Dyre Tjeldvoll
      32. deadcode.v1.diff
        1 kB

        Activity

        Hide
        Mike Matrigali added a comment -

        Is this one done? just checking to see if we can close some 10.3 targeted issues.

        Show
        Mike Matrigali added a comment - Is this one done? just checking to see if we can close some 10.3 targeted issues.
        Hide
        Knut Anders Hatlen added a comment -

        Committed grow.diff with revision 504460.

        Show
        Knut Anders Hatlen added a comment - Committed grow.diff with revision 504460.
        Hide
        Dyre Tjeldvoll added a comment -

        I think grow.diff should be committed.

        Show
        Dyre Tjeldvoll added a comment - I think grow.diff should be committed.
        Hide
        Knut Anders Hatlen added a comment -

        It seems to me that grow() is unnecessarily complex. It has some extra logic to handle the case where there are enough free bits in the last byte, but that does not seem any different from the case where the byte array contains enough free space. Also, it loops through the previously unused bytes and clears them, which should be unnecessary because the unused bytes should always be 0 (part of the invariant).

        grow.diff removes the unneeded special case and the clearing of the unused bytes. It also adds a call to ASSERT(invariantHolds()) in grow() and shrink(). Derbyall and the JUnit tests passed.

        Show
        Knut Anders Hatlen added a comment - It seems to me that grow() is unnecessarily complex. It has some extra logic to handle the case where there are enough free bits in the last byte, but that does not seem any different from the case where the byte array contains enough free space. Also, it loops through the previously unused bytes and clears them, which should be unnecessary because the unused bytes should always be 0 (part of the invariant). grow.diff removes the unneeded special case and the clearing of the unused bytes. It also adds a call to ASSERT(invariantHolds()) in grow() and shrink(). Derbyall and the JUnit tests passed.
        Hide
        Knut Anders Hatlen added a comment -

        Thanks Dyre! Committed shrink.v2 with revision 502185.

        Show
        Knut Anders Hatlen added a comment - Thanks Dyre! Committed shrink.v2 with revision 502185.
        Hide
        Dyre Tjeldvoll added a comment -

        shrink.v2 addresses the review comments.

        Show
        Dyre Tjeldvoll added a comment - shrink.v2 addresses the review comments.
        Hide
        Knut Anders Hatlen added a comment -

        A couple of tiny comments to shrink.v1:

        1) Javadoc for grow() says "Negative n are not allowed." Probably better (grammatically) with "neg. values of n are" or "neg. n's are".
        2) shrink() defines lastByteNum as numBytesFromBits - 1, but later we have to add one to use it. Would it be (slightly) better to call it firstUnusedByte and not subtract one until we need the last byte?
        3) testGrowSmaller() and testGrow0() have lines without indentation.

        Show
        Knut Anders Hatlen added a comment - A couple of tiny comments to shrink.v1: 1) Javadoc for grow() says "Negative n are not allowed." Probably better (grammatically) with "neg. values of n are" or "neg. n's are". 2) shrink() defines lastByteNum as numBytesFromBits - 1, but later we have to add one to use it. Would it be (slightly) better to call it firstUnusedByte and not subtract one until we need the last byte? 3) testGrowSmaller() and testGrow0() have lines without indentation.
        Hide
        Dyre Tjeldvoll added a comment -

        I think a single method makes a lot of sense, and I would seriously consider it if it wasn't for the large number of files that would have to be changed. (Which is also why I didn't rename anySetBit() to something more meaningful. e.g. firstSetBit()).

        Show
        Dyre Tjeldvoll added a comment - I think a single method makes a lot of sense, and I would seriously consider it if it wasn't for the large number of files that would have to be changed. (Which is also why I didn't rename anySetBit() to something more meaningful. e.g. firstSetBit()).
        Hide
        Bryan Pendleton added a comment -

        > there is an asymmetry between grow(int) and shrink(int)

        Would it make sense to collapse grow() and shrink() into a single method,
        perhaps called something like "setBitSize()" or "setCapacity"? It might make
        it more clear at the caller's level that it doesn't matter whether you are
        making the bitset larger or smaller; you just need to specify the new size.

        Show
        Bryan Pendleton added a comment - > there is an asymmetry between grow(int) and shrink(int) Would it make sense to collapse grow() and shrink() into a single method, perhaps called something like "setBitSize()" or "setCapacity"? It might make it more clear at the caller's level that it doesn't matter whether you are making the bitset larger or smaller; you just need to specify the new size.
        Hide
        Dyre Tjeldvoll added a comment -

        Attaching shrink.v1 which tries get rid of some of the unexpected behavior when growing and shrinking bitsets (commented in the unit test). The goal was that both grow(int) and shrink(int) should disallow negative arguments and grow(int) should only to accept arguments larger than the current bitset size, and by symmetry shrink(int) should only allow arguments smaller than the current size. Unfortunately, that is currently not possible since the existing usage requires grow(int n) where n is smaller than the current size to be a noop. So in the patch there is an asymmetry between grow(int) and shrink(int) where shrink(int) will not accept n larger than the current size.

        The patch also changes the order of the asserts in the tests so that the assert on the length of the byte array always is the last assert. This is necessary since getByteArray() trims away any unused bytes of the byte array making any later invariant checks etc. worthless.

        The patch fixes a bug in shrink(int). Without this patch shrink(int) will not clear any bytes beyond the new last byte. E.g. if a bitset is shrunk from 18 to 9 bits, the now unused bits in the second byte will be cleared, but NOT the bits in the third byte. This error was not caught by any tests because they called getByteArray() before checking the invariant.

        Show
        Dyre Tjeldvoll added a comment - Attaching shrink.v1 which tries get rid of some of the unexpected behavior when growing and shrinking bitsets (commented in the unit test). The goal was that both grow(int) and shrink(int) should disallow negative arguments and grow(int) should only to accept arguments larger than the current bitset size, and by symmetry shrink(int) should only allow arguments smaller than the current size. Unfortunately, that is currently not possible since the existing usage requires grow(int n) where n is smaller than the current size to be a noop. So in the patch there is an asymmetry between grow(int) and shrink(int) where shrink(int) will not accept n larger than the current size. The patch also changes the order of the asserts in the tests so that the assert on the length of the byte array always is the last assert. This is necessary since getByteArray() trims away any unused bytes of the byte array making any later invariant checks etc. worthless. The patch fixes a bug in shrink(int). Without this patch shrink(int) will not clear any bytes beyond the new last byte. E.g. if a bitset is shrunk from 18 to 9 bits, the now unused bits in the second byte will be cleared, but NOT the bits in the third byte. This error was not caught by any tests because they called getByteArray() before checking the invariant.
        Hide
        Knut Anders Hatlen added a comment -

        Committed numbitstest.diff with revision 501368.

        Show
        Knut Anders Hatlen added a comment - Committed numbitstest.diff with revision 501368.
        Hide
        Knut Anders Hatlen added a comment -

        numbitstest.diff adds the tests I ran to verify that getNumBitsSet() returns the same number of bits as Integer.bitCount(). The test cases test all byte values from Byte.MIN_VALUE to Byte.MAX_VALUE for a one-byte bit set and a two-byte bit set.

        Show
        Knut Anders Hatlen added a comment - numbitstest.diff adds the tests I ran to verify that getNumBitsSet() returns the same number of bits as Integer.bitCount(). The test cases test all byte values from Byte.MIN_VALUE to Byte.MAX_VALUE for a one-byte bit set and a two-byte bit set.
        Hide
        Knut Anders Hatlen added a comment -

        numbitsset.v2 also looks good. I verified that the algorithm that calculated the number of bits in a byte returned the same value as Integer.bitCount(0xff & b) which is included in jdk>=1.5.

        Committed revision 501334.

        Show
        Knut Anders Hatlen added a comment - numbitsset.v2 also looks good. I verified that the algorithm that calculated the number of bits in a byte returned the same value as Integer.bitCount(0xff & b) which is included in jdk>=1.5. Committed revision 501334.
        Hide
        Knut Anders Hatlen added a comment -

        anysetbit.v2 looks good and the tests passed. Committed revision 501095.

        Show
        Knut Anders Hatlen added a comment - anysetbit.v2 looks good and the tests passed. Committed revision 501095.
        Hide
        Dyre Tjeldvoll added a comment -

        New patch which addresses the review comments.

        Show
        Dyre Tjeldvoll added a comment - New patch which addresses the review comments.
        Hide
        Øystein Grøvlen added a comment -

        > Dyre Tjeldvoll commented on DERBY-2191:
        > ---------------------------------------
        >
        > I agree that the javadoc comment for anySetBit should be improved,
        > and will include this in the next version of the patch.

        Good.

        >
        > I'll take a look at ResultColumnList.generateHolderMethod() and see
        > if anything can be done.

        Before putting too much effort into that, I think you should consider
        whether this is a time critical function. Since its seem to be used
        during compilation, it might not be that critical.

        >
        > Wrt. firstSet(): I did consider your approach, but that will always
        > result in 4 comparisons. The current approach will give an average
        > of 4 comparisons if the set bits are uniformly distributed in the
        > byte. If you look at how firstSet is used in anySetBit(int) you'll
        > see that the argument is frequently shifted a number of bits to the
        > left. So the bits are not uniformly distributed, but rather
        > clustered in the beginning of the byte.

        Good point, and thinking a bit more about this, even if my suggestion
        will always give 3 (not 4!) comparisons, I think your way is probably
        better also when you do not take the shifting into account. If all
        bit combinations are equally likely, 50% of the cases will only need
        one comparison, and the average number of comparisons will be just
        below 2. (I think your numbers assume that only one bit is set at a
        time.)

        Show
        Øystein Grøvlen added a comment - > Dyre Tjeldvoll commented on DERBY-2191 : > --------------------------------------- > > I agree that the javadoc comment for anySetBit should be improved, > and will include this in the next version of the patch. Good. > > I'll take a look at ResultColumnList.generateHolderMethod() and see > if anything can be done. Before putting too much effort into that, I think you should consider whether this is a time critical function. Since its seem to be used during compilation, it might not be that critical. > > Wrt. firstSet(): I did consider your approach, but that will always > result in 4 comparisons. The current approach will give an average > of 4 comparisons if the set bits are uniformly distributed in the > byte. If you look at how firstSet is used in anySetBit(int) you'll > see that the argument is frequently shifted a number of bits to the > left. So the bits are not uniformly distributed, but rather > clustered in the beginning of the byte. Good point, and thinking a bit more about this, even if my suggestion will always give 3 (not 4!) comparisons, I think your way is probably better also when you do not take the shifting into account. If all bit combinations are equally likely, 50% of the cases will only need one comparison, and the average number of comparisons will be just below 2. (I think your numbers assume that only one bit is set at a time.)
        Hide
        Dyre Tjeldvoll added a comment -

        New patch which addresses the review comments.

        Show
        Dyre Tjeldvoll added a comment - New patch which addresses the review comments.
        Hide
        Dyre Tjeldvoll added a comment -

        I agree that the javadoc comment for anySetBit should be improved, and will include this in the next version of the patch.

        I'll take a look at ResultColumnList.generateHolderMethod() and see if anything can be done.

        Wrt. firstSet(): I did consider your approach, but that will always result in 4 comparisons. The current approach will give an average of 4 comparisons if the set bits are uniformly distributed in the byte. If you look at how firstSet is used in anySetBit(int) you'll see that the argument is frequently shifted a number of bits to the left. So the bits are not uniformly distributed, but rather clustered in the beginning of the byte.

        Show
        Dyre Tjeldvoll added a comment - I agree that the javadoc comment for anySetBit should be improved, and will include this in the next version of the patch. I'll take a look at ResultColumnList.generateHolderMethod() and see if anything can be done. Wrt. firstSet(): I did consider your approach, but that will always result in 4 comparisons. The current approach will give an average of 4 comparisons if the set bits are uniformly distributed in the byte. If you look at how firstSet is used in anySetBit(int) you'll see that the argument is frequently shifted a number of bits to the left. So the bits are not uniformly distributed, but rather clustered in the beginning of the byte.
        Hide
        Øystein Grøvlen added a comment -

        The javadoc for anySetBit says that it will "return the bit number of
        a bit that is set". If I understand the implementation correctly, it
        will always return the first bit that is set. Looking at some of the
        usages of anySetBit, it seems like this behavior is also assumed. For
        example, it seems to be used to iterate over all set bits of the bit
        set (e.g., BasicNoPutResultSetImpl.getCompactRow()).

        It think it would be a good idea to give this function a more
        descriptive name or at least change the javadoc to reflect the assumed
        and implemented behavior.

        I also seems that at least for one usage of anySetBit(), returning the
        first set bit, is the least optimal (see
        ResultColumnList.generateHolderMethod(), which tries to find the
        maximum column number for which the corresponding bit is set).

        For firstSet(), you can reduce the average number of comparisons by
        doing binary search, something like:

        if ((v & 0xf0) != 0) {
        if ((v & 0xc0) != 0) {
        if ((v & 0x80) != 0)

        { return 0; }

        else

        { return 0x1; }

        } else {
        if ((v & 0x20) != 0)

        { return 0x2; }

        else

        { return 0x3; }

        }
        else {
        if ((v & 0xc) != 0) {
        ...

        However, I cannot guarantee that a smaller number of comparisons will
        give better performance. I have no idea what impact this has with
        respect to how the processsor does instruction preloading etc.

        Show
        Øystein Grøvlen added a comment - The javadoc for anySetBit says that it will "return the bit number of a bit that is set". If I understand the implementation correctly, it will always return the first bit that is set. Looking at some of the usages of anySetBit, it seems like this behavior is also assumed. For example, it seems to be used to iterate over all set bits of the bit set (e.g., BasicNoPutResultSetImpl.getCompactRow()). It think it would be a good idea to give this function a more descriptive name or at least change the javadoc to reflect the assumed and implemented behavior. I also seems that at least for one usage of anySetBit(), returning the first set bit, is the least optimal (see ResultColumnList.generateHolderMethod(), which tries to find the maximum column number for which the corresponding bit is set). For firstSet(), you can reduce the average number of comparisons by doing binary search, something like: if ((v & 0xf0) != 0) { if ((v & 0xc0) != 0) { if ((v & 0x80) != 0) { return 0; } else { return 0x1; } } else { if ((v & 0x20) != 0) { return 0x2; } else { return 0x3; } } else { if ((v & 0xc) != 0) { ... However, I cannot guarantee that a smaller number of comparisons will give better performance. I have no idea what impact this has with respect to how the processsor does instruction preloading etc.
        Hide
        Knut Anders Hatlen added a comment -

        I have looked at anysetbit.v1 and numbitsset.v1. I think the changes look very good. And thank you for documenting the algorithm used in getNumBitsSet() so thoroughly!

        Some small comments:
        1) In anySetBits(), perhaps "return (umul8|firstSet(v));" would be clearer if it used plus instead of or?
        2) Since anySetBits() and getNumBitsSet() now use all bits in the last byte, maybe they should start with "ASSERT(invariantHolds())"?
        3) I think it would be good if firstSet() had a short comment explaining what it's supposed to do.
        4) Would it be better if the return statements in firstSet() didn't use hex format? Since the returned values are positions and not bit patterns, I mean.

        Show
        Knut Anders Hatlen added a comment - I have looked at anysetbit.v1 and numbitsset.v1. I think the changes look very good. And thank you for documenting the algorithm used in getNumBitsSet() so thoroughly! Some small comments: 1) In anySetBits(), perhaps "return (umul8 |firstSet(v));" would be clearer if it used plus instead of or? 2) Since anySetBits() and getNumBitsSet() now use all bits in the last byte, maybe they should start with "ASSERT(invariantHolds())"? 3) I think it would be good if firstSet() had a short comment explaining what it's supposed to do. 4) Would it be better if the return statements in firstSet() didn't use hex format? Since the returned values are positions and not bit patterns, I mean.
        Hide
        Dyre Tjeldvoll added a comment -

        Attaching numbitsset.v1 which re-writes the method getNumBitsSet() to do a byte-by-byte calculation. Tests pass. Please review.

        Show
        Dyre Tjeldvoll added a comment - Attaching numbitsset.v1 which re-writes the method getNumBitsSet() to do a byte-by-byte calculation. Tests pass. Please review.
        Hide
        Dyre Tjeldvoll added a comment -

        Attaching anysetbit.v1 which re-writes anySetBit and anySetBit(int beyondBit) to use a single loop and removes special handling of the last byte. Tests pass. Please review.

        Show
        Dyre Tjeldvoll added a comment - Attaching anysetbit.v1 which re-writes anySetBit and anySetBit(int beyondBit) to use a single loop and removes special handling of the last byte. Tests pass. Please review.
        Hide
        Knut Anders Hatlen added a comment -

        Thank you for addressing my comments! bitopt.v2 looks good. Committed revision 500177.

        Show
        Knut Anders Hatlen added a comment - Thank you for addressing my comments! bitopt.v2 looks good. Committed revision 500177.
        Hide
        Dyre Tjeldvoll added a comment -

        Attaching bitopt.v2 to address the review comments.

        Show
        Dyre Tjeldvoll added a comment - Attaching bitopt.v2 to address the review comments.
        Hide
        Knut Anders Hatlen added a comment -

        I think the rewrite of numBitsInLastByte() made the code clearer. The error checks in the constructor and shrink() also seem useful. Two comments:
        1) umul8() doesn't seem to be used.
        2) I think isSet(), set() and clear() became more difficult to read. I think it would be better if they kept the old style where the byte number and bit position were calculated separately and put into variables with meaningful names (they can still use the new methods).

        Show
        Knut Anders Hatlen added a comment - I think the rewrite of numBitsInLastByte() made the code clearer. The error checks in the constructor and shrink() also seem useful. Two comments: 1) umul8() doesn't seem to be used. 2) I think isSet(), set() and clear() became more difficult to read. I think it would be better if they kept the old style where the byte number and bit position were calculated separately and put into variables with meaningful names (they can still use the new methods).
        Hide
        Dyre Tjeldvoll added a comment -

        Attaching bitopt.v1 which introduces the unsigned arithmetic optimization discussed on derby-dev earlier. The optimization has been isolated in static utility methods. derbyall and suites.All pass.

        Show
        Dyre Tjeldvoll added a comment - Attaching bitopt.v1 which introduces the unsigned arithmetic optimization discussed on derby-dev earlier. The optimization has been isolated in static utility methods. derbyall and suites.All pass.
        Hide
        Knut Anders Hatlen added a comment -

        bitops.v3 looks good the tests passed. Fixed a copy/paste error in xor's javadoc (AND -> XOR) and committed revision 498772.

        Show
        Knut Anders Hatlen added a comment - bitops.v3 looks good the tests passed. Fixed a copy/paste error in xor's javadoc (AND -> XOR) and committed revision 498772.
        Hide
        Dyre Tjeldvoll added a comment -

        Attached bitops.v3 which removes the unnecessary check for getLength()==0

        Show
        Dyre Tjeldvoll added a comment - Attached bitops.v3 which removes the unnecessary check for getLength()==0
        Hide
        Knut Anders Hatlen added a comment -

        Thanks Dyre, bitops.v2 looks good! I only have one tiny comment: The patch copies this pattern (not introduced by you) from or() to and() and xor():

        if (otherBit == null || otherBit.getLength() == 0)
        return;

        int otherLength = otherBit.getLength();

        For non-empty bitsets (which I expect is the normal case) otherBit.getLength() is called twice. It seems like all three methods will handle empty bitsets correctly without the check for getLength()==0, so perhaps it would be better to remove it.

        Show
        Knut Anders Hatlen added a comment - Thanks Dyre, bitops.v2 looks good! I only have one tiny comment: The patch copies this pattern (not introduced by you) from or() to and() and xor(): if (otherBit == null || otherBit.getLength() == 0) return; int otherLength = otherBit.getLength(); For non-empty bitsets (which I expect is the normal case) otherBit.getLength() is called twice. It seems like all three methods will handle empty bitsets correctly without the check for getLength()==0, so perhaps it would be better to remove it.
        Hide
        Knut Anders Hatlen added a comment -

        > When I think about it, I don't think should matter what the state of the unused bits is, at least as long as all other operations respect the lengthAsBits member

        That's a good point. However, if we knew for sure that the unused bits are always unset, it would also be possible to perform bytewise operations in getNumBitsSet().

        Show
        Knut Anders Hatlen added a comment - > When I think about it, I don't think should matter what the state of the unused bits is, at least as long as all other operations respect the lengthAsBits member That's a good point. However, if we knew for sure that the unused bits are always unset, it would also be possible to perform bytewise operations in getNumBitsSet().
        Hide
        Dyre Tjeldvoll added a comment -

        bitops.v2 adds the requested ASSERT.

        Show
        Dyre Tjeldvoll added a comment - bitops.v2 adds the requested ASSERT.
        Hide
        Dyre Tjeldvoll added a comment -

        Actually I did run derbyall and suites.All with such ASSERTS in my sandbox yesterday, without seeing any failures. But I removed them before creating the patch But, sure I could put them back. When I think about it, I don't think should matter what the state of the unused bits is, at least as long as all other operations respect the lengthAsBits member (I haven't verified that this is in fact the case, but seems to be the intention), and that grow() does the proper initialization. But I still think it is good to maintain this as an invariant because it makes it easier to detect errors.

        Show
        Dyre Tjeldvoll added a comment - Actually I did run derbyall and suites.All with such ASSERTS in my sandbox yesterday, without seeing any failures. But I removed them before creating the patch But, sure I could put them back. When I think about it, I don't think should matter what the state of the unused bits is, at least as long as all other operations respect the lengthAsBits member (I haven't verified that this is in fact the case, but seems to be the intention), and that grow() does the proper initialization. But I still think it is good to maintain this as an invariant because it makes it easier to detect errors.
        Hide
        Knut Anders Hatlen added a comment -

        This sounds like a very good change. Would it be good to call have an assert in or/and/xor which tested invariantHolds()? Now that they perform the operations bytewise, they require that the extra bits are unset, and I think that's worth testing in sane builds.

        Show
        Knut Anders Hatlen added a comment - This sounds like a very good change. Would it be good to call have an assert in or/and/xor which tested invariantHolds()? Now that they perform the operations bytewise, they require that the extra bits are unset, and I think that's worth testing in sane builds.
        Hide
        Dyre Tjeldvoll added a comment -

        Attached another patch (bitops.v1) which changes the bitset operator methods or(), and() and xor() so that they follow the same pattern. That is; they all accept null as an argument and treats that as an empty bitset, and they all allow operands of all sizes and handles them the same way. All are now performing the operation bytewise, and there is no special handling of the last partial byte. The patch also adds a method called invariantHolds() that checks if the class' invariant is maintained (for use in the unit test).

        Show
        Dyre Tjeldvoll added a comment - Attached another patch (bitops.v1) which changes the bitset operator methods or(), and() and xor() so that they follow the same pattern. That is; they all accept null as an argument and treats that as an empty bitset, and they all allow operands of all sizes and handles them the same way. All are now performing the operation bytewise, and there is no special handling of the last partial byte. The patch also adds a method called invariantHolds() that checks if the class' invariant is maintained (for use in the unit test).
        Hide
        Knut Anders Hatlen added a comment -

        Committed cleanup2191.diff with revision 497003.
        Committed unusedmethods.v1.diff with revision 497396.
        Committed boundarycheck.v1.diff revision 497398.

        Show
        Knut Anders Hatlen added a comment - Committed cleanup2191.diff with revision 497003. Committed unusedmethods.v1.diff with revision 497396. Committed boundarycheck.v1.diff revision 497398.
        Hide
        Dyre Tjeldvoll added a comment -

        I've attached boundarycheck.v1 which adds argument checking to isSet, set and clear. Please review.

        Show
        Dyre Tjeldvoll added a comment - I've attached boundarycheck.v1 which adds argument checking to isSet, set and clear. Please review.
        Hide
        Dyre Tjeldvoll added a comment -

        I have looked at Knut's patch and I think those changes are good and should be committed.

        Knut's observation that small negative bit indexes will not trigger an ArrayIndexOutOfBoundsException is absolutely correct, so I think I'll submit my patch for this.

        I just uploaded a patch (unusedmethods.v1) which removes unused methods. derbyall and suites.All pass.

        Show
        Dyre Tjeldvoll added a comment - I have looked at Knut's patch and I think those changes are good and should be committed. Knut's observation that small negative bit indexes will not trigger an ArrayIndexOutOfBoundsException is absolutely correct, so I think I'll submit my patch for this. I just uploaded a patch (unusedmethods.v1) which removes unused methods. derbyall and suites.All pass.
        Hide
        Knut Anders Hatlen added a comment -

        When I looked at the previous patches, I noticed some other minor issues that could be fixed in FormatableBitSet. The proposed changes are attached as cleanup2191.diff:

        • Remove unused import java.io.InputStream.
        • numBytesFromBits() is unnecessarily complex. Currently, it returns (bits == 0) ? 0 : ((bits - 1) / 8) + 1, but for non-negative values of bits, (bits + 7) >> 3 gives the same results (and bits should never be negative).
        • Remove unused variables in toString().
        Show
        Knut Anders Hatlen added a comment - When I looked at the previous patches, I noticed some other minor issues that could be fixed in FormatableBitSet. The proposed changes are attached as cleanup2191.diff: Remove unused import java.io.InputStream. numBytesFromBits() is unnecessarily complex. Currently, it returns (bits == 0) ? 0 : ((bits - 1) / 8) + 1, but for non-negative values of bits, (bits + 7) >> 3 gives the same results (and bits should never be negative). Remove unused variables in toString().
        Hide
        Knut Anders Hatlen added a comment -

        If we have a check for the position, I think we should test both whether it is too big and whether it is too small. None of those cases should happen, and I don't think bugs causing negative positions are less likely or less severe than bugs causing too big positions. If negative positions always caused other exceptions, it would probably be OK to skip the test, but I'm not sure they always do. At least, it doesn't seem like isSet(), set() and clear() always throw exceptions when the position is negative. I think it must be sufficiently negative compared to the size of the bit set before an exception is thrown. In FormatableBitSetTest, set(-1) and clear(-1) fail when the set is empty, but don't fail otherwise. isSet(-1) doesn't fail in any case.

        Show
        Knut Anders Hatlen added a comment - If we have a check for the position, I think we should test both whether it is too big and whether it is too small. None of those cases should happen, and I don't think bugs causing negative positions are less likely or less severe than bugs causing too big positions. If negative positions always caused other exceptions, it would probably be OK to skip the test, but I'm not sure they always do. At least, it doesn't seem like isSet(), set() and clear() always throw exceptions when the position is negative. I think it must be sufficiently negative compared to the size of the bit set before an exception is thrown. In FormatableBitSetTest, set(-1) and clear(-1) fail when the set is empty, but don't fail otherwise. isSet(-1) doesn't fail in any case.
        Hide
        Dyre Tjeldvoll added a comment -

        I'm looking at how boundary checking could be improved in this class, but I'm a bit unsure about what the best approach is. Right now I have a patch in my sandbox that introduces a private checkPosition() method that is used in all accessors (isSet(), set(), clear()), which throws IllegalArgumentException if the position is negative or larger than the largest legal bit position. This gives a clean interface where all illegal positions are handled uniformly and the test looks cleaner. But I'm a bit uncomfortable with all this extra checking for an internal interface that only developers will use.

        If we accept that using a negative bit doesn't result in the same exception being thrown we could cut down on the testing since a negative bit index always will result in an exception. If we can accept that access to illegal bit positions in the last byte go undetected we could drop the extra check altogether... (there is, of course, the issue of what set(illegalIndex) should do. Nothing?)

        Another more radical solution is to require bitset sizes to be a multiple of 8. Then one would not need to track the number of bits used. This would only be possible if we remove concatenate() (which is unused).

        Show
        Dyre Tjeldvoll added a comment - I'm looking at how boundary checking could be improved in this class, but I'm a bit unsure about what the best approach is. Right now I have a patch in my sandbox that introduces a private checkPosition() method that is used in all accessors (isSet(), set(), clear()), which throws IllegalArgumentException if the position is negative or larger than the largest legal bit position. This gives a clean interface where all illegal positions are handled uniformly and the test looks cleaner. But I'm a bit uncomfortable with all this extra checking for an internal interface that only developers will use. If we accept that using a negative bit doesn't result in the same exception being thrown we could cut down on the testing since a negative bit index always will result in an exception. If we can accept that access to illegal bit positions in the last byte go undetected we could drop the extra check altogether... (there is, of course, the issue of what set(illegalIndex) should do. Nothing?) Another more radical solution is to require bitset sizes to be a multiple of 8. Then one would not need to track the number of bits used. This would only be possible if we remove concatenate() (which is unused).
        Hide
        Knut Anders Hatlen added a comment -

        Verified that there is no way to set value to null. All tests ran cleanly.
        Committed revision 496719.

        Show
        Knut Anders Hatlen added a comment - Verified that there is no way to set value to null. All tests ran cleanly. Committed revision 496719.
        Hide
        Dyre Tjeldvoll added a comment -

        I've attached a patch (valuenotnull.v1) which ensures that the 'value' byte array never is null, and removes the now redundant null-checks. The default constructor initializes 'value' to a shared zero-length array instance taken from ReuseFactory Pleas review.

        Show
        Dyre Tjeldvoll added a comment - I've attached a patch (valuenotnull.v1) which ensures that the 'value' byte array never is null, and removes the now redundant null-checks. The default constructor initializes 'value' to a shared zero-length array instance taken from ReuseFactory Pleas review.
        Hide
        Dyre Tjeldvoll added a comment -

        committed by KAH

        Show
        Dyre Tjeldvoll added a comment - committed by KAH
        Hide
        Knut Anders Hatlen added a comment -

        I have committed fbstst.v1.diff to trunk with revision 496312. I made one minor change before committing: Some of the test cases used assertEquals(double, double, double) to compare two integers. Changed them to assertEquals(int, int) instead.

        Show
        Knut Anders Hatlen added a comment - I have committed fbstst.v1.diff to trunk with revision 496312. I made one minor change before committing: Some of the test cases used assertEquals(double, double, double) to compare two integers. Changed them to assertEquals(int, int) instead.
        Hide
        Dyre Tjeldvoll added a comment -

        I can confirm that the following methods aren't used and can be removed:

        public FormatableBitSet(byte[] newValue, int numBits)
        public FormatableBitSet concatenate(FormatableBitSet other)
        private static byte hexCharToByte(char hexChar)
        private static char[] decodeArray =

        {'0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'a', 'b', 'c', 'd', 'e', 'f'}

        ;
        public void readExternalFromArray(ArrayInputStream in) throws IOException

        I plan to submit a patch for this when the JUnit test has been committed (since removing these method will require changes to the test)

        Show
        Dyre Tjeldvoll added a comment - I can confirm that the following methods aren't used and can be removed: public FormatableBitSet(byte[] newValue, int numBits) public FormatableBitSet concatenate(FormatableBitSet other) private static byte hexCharToByte(char hexChar) private static char[] decodeArray = {'0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'a', 'b', 'c', 'd', 'e', 'f'} ; public void readExternalFromArray(ArrayInputStream in) throws IOException I plan to submit a patch for this when the JUnit test has been committed (since removing these method will require changes to the test)
        Hide
        Dyre Tjeldvoll added a comment -

        I've attached a patch (fbstst.v1) that adds (a modified) FormatableBitSetTest.java as a JUnit unit test. The patch creates a new junit package under the unitTests package with a corresponding _Suite class, as discussed on derby-dev. No new JUnit harness was created at the top-level, but the unitTests.junit._Suite was added to the functionTests.allPackages suite.

        suites.All ran successfully in both sane and insane mode.

        Note! Patch contains one new directory and two new files. See attached svn stat output for details.

        Please review.

        Show
        Dyre Tjeldvoll added a comment - I've attached a patch (fbstst.v1) that adds (a modified) FormatableBitSetTest.java as a JUnit unit test. The patch creates a new junit package under the unitTests package with a corresponding _Suite class, as discussed on derby-dev. No new JUnit harness was created at the top-level, but the unitTests.junit._Suite was added to the functionTests.allPackages suite. suites.All ran successfully in both sane and insane mode. Note! Patch contains one new directory and two new files. See attached svn stat output for details. Please review.
        Hide
        Dyre Tjeldvoll added a comment -

        Thank you for looking at the results

        Some answers to your questions:

        • Yes, I would prefer to remove code that isn't used. (Only caveat is that the two-arg constructor is really useful when writing tests. Particularly if you want to create a bitset that doesn't have a multiple of 8 number of bytes)
        • Yes, I would prefer to have value!=null be an ivariant as well (Possible caveat; can you assign to value in the ctor used when doing de-serialization?)
        • No, I don't like the way asserts are used either. Apart from what you mention it also makes it hard(er) to write tests that will work both in sane and insane mode.
        • Yes, would prefer that isSet didn't catch ArrayIndexOutOfBoundsException, or always checked the validity of its arguments. If you want all access to invalid bit indices to be handled the same way, I think you will have to check the argument manually since you are not guaranteed that access to a bit index larger than the bitset's max index will trigger an ArrayIndexOutOfBoundsException.
        Show
        Dyre Tjeldvoll added a comment - Thank you for looking at the results Some answers to your questions: Yes, I would prefer to remove code that isn't used. (Only caveat is that the two-arg constructor is really useful when writing tests. Particularly if you want to create a bitset that doesn't have a multiple of 8 number of bytes) Yes, I would prefer to have value!=null be an ivariant as well (Possible caveat; can you assign to value in the ctor used when doing de-serialization?) No, I don't like the way asserts are used either. Apart from what you mention it also makes it hard(er) to write tests that will work both in sane and insane mode. Yes, would prefer that isSet didn't catch ArrayIndexOutOfBoundsException, or always checked the validity of its arguments. If you want all access to invalid bit indices to be handled the same way, I think you will have to check the argument manually since you are not guaranteed that access to a bit index larger than the bitset's max index will trigger an ArrayIndexOutOfBoundsException.
        Hide
        Knut Anders Hatlen added a comment -

        Hi Dyre,

        Thanks for writing the test and posting the issues!

        Some comments/questions to the results from your test:

        • I think some of the methods that have issues/bugs are not used
          (seems to be the case for concatenate and two-arg constructor, at
          least). Would it be better to remove the dead/broken code than to
          fix it?
        • Some of the issues/inconsistencies are caused by the handling of
          (value == null). In the code coverage report, it seems like the code
          paths for null handling are rarely (never?) exercised. Is it
          possible to make (value != null) invariant (for instance by checking
          in the constructor)? Then we could remove most of the null checks. I
          think that would make the code cleaner (and it would increase the
          code coverage percentage).
        • I would expect and(), or() and xor() to handle arguments (null,
          differing length, etc) the same way. Since FormatableBitSet is a
          library class in the services.io package, I think it is strange if
          its methods throw assert error for special cases or boundary
          conditions. In my opinion, such asserts belong at a higher level
          where one can tell from the context that it's a bad state.
        • In addition to the issue you noticed, I think isSet() might hide
          some errors because it catches ArrayIndexOutOfBoundsException and
          returns false (in insane mode). Do you think it would be safe to
          remove the try/catch? To me, it sounds more correct to propagate the
          exception than to silently ignore it.
        Show
        Knut Anders Hatlen added a comment - Hi Dyre, Thanks for writing the test and posting the issues! Some comments/questions to the results from your test: I think some of the methods that have issues/bugs are not used (seems to be the case for concatenate and two-arg constructor, at least). Would it be better to remove the dead/broken code than to fix it? Some of the issues/inconsistencies are caused by the handling of (value == null). In the code coverage report, it seems like the code paths for null handling are rarely (never?) exercised. Is it possible to make (value != null) invariant (for instance by checking in the constructor)? Then we could remove most of the null checks. I think that would make the code cleaner (and it would increase the code coverage percentage). I would expect and(), or() and xor() to handle arguments (null, differing length, etc) the same way. Since FormatableBitSet is a library class in the services.io package, I think it is strange if its methods throw assert error for special cases or boundary conditions. In my opinion, such asserts belong at a higher level where one can tell from the context that it's a bad state. In addition to the issue you noticed, I think isSet() might hide some errors because it catches ArrayIndexOutOfBoundsException and returns false (in insane mode). Do you think it would be safe to remove the try/catch? To me, it sounds more correct to propagate the exception than to silently ignore it.
        Hide
        Dyre Tjeldvoll added a comment -

        Here is a brief list of bugs/issues that was found while writing FormatableBitSetTest.java:

        1) The two-arg constructor fails if given a size arg that is smaller than the size of the byte array arg

        2) When the copy constructor is used to copy an empty bitset with a null byte array, the resulting copy has a non-null byte array

        3) It is legal to call grow(int) with an arg that is smaller that the bitset's current size, but the size of the bitset remains the same.

        4) It is legal to call grow(int) with a negative argument, but the size of the bitset remains the same.

        5) Shrinking an empty bitset to size 0 causes an AssertFailure

        6) It is legal to call shrink(int) with an arg that is larger that the bitset's current size, but the size of the bitset remains the same.

        7) Shrinking an ordinary bitset to size 0 causes an AssertFailure

        8) The return value of compare doesn't match what is stated in the javadoc for the method

        9) The concatenate method does not work. Concatenation of two 18 bit bitsets yields a bitset that is still 18 bits.

        10) The method isSet(-1) gives an NPE for an empty bitset, but returns false for a non-empty set.

        11) The method set(-1) gives an NPE for an empty bitset, but does nothing for a non-empty set.

        12) The method clear(-1) gives an NPE for an empty bitset, but does nothing for a non-empty set.

        13) The method anySetBit() gives an NPE for an empty bitset, -1 would be more reasonable

        14) The method anySetBit(-1) is equivalent to anySetBit(), but smaller args (-2, -3) do not throw an exception, but also do not necessarily return the same value as -1

        15) The method or(FormatableBitSet) accepts null as an argument and treats it as if it represents an empty bitset. and(FormatableBitSet) will throw an AssertFailure, but xor(FormatableBitSet) throws an NPE

        16) The methods or(FormatableBitSet) and and(FormatableBitSet) can be used with args that are smaller or larger. xor() requires the arg to have the same size as "this"

        17) readExternalFromArray does not appear to have a write counterpart. Using it on something that is wtitten with writeExternal does not seem to work

        Show
        Dyre Tjeldvoll added a comment - Here is a brief list of bugs/issues that was found while writing FormatableBitSetTest.java: 1) The two-arg constructor fails if given a size arg that is smaller than the size of the byte array arg 2) When the copy constructor is used to copy an empty bitset with a null byte array, the resulting copy has a non-null byte array 3) It is legal to call grow(int) with an arg that is smaller that the bitset's current size, but the size of the bitset remains the same. 4) It is legal to call grow(int) with a negative argument, but the size of the bitset remains the same. 5) Shrinking an empty bitset to size 0 causes an AssertFailure 6) It is legal to call shrink(int) with an arg that is larger that the bitset's current size, but the size of the bitset remains the same. 7) Shrinking an ordinary bitset to size 0 causes an AssertFailure 8) The return value of compare doesn't match what is stated in the javadoc for the method 9) The concatenate method does not work. Concatenation of two 18 bit bitsets yields a bitset that is still 18 bits. 10) The method isSet(-1) gives an NPE for an empty bitset, but returns false for a non-empty set. 11) The method set(-1) gives an NPE for an empty bitset, but does nothing for a non-empty set. 12) The method clear(-1) gives an NPE for an empty bitset, but does nothing for a non-empty set. 13) The method anySetBit() gives an NPE for an empty bitset, -1 would be more reasonable 14) The method anySetBit(-1) is equivalent to anySetBit(), but smaller args (-2, -3) do not throw an exception, but also do not necessarily return the same value as -1 15) The method or(FormatableBitSet) accepts null as an argument and treats it as if it represents an empty bitset. and(FormatableBitSet) will throw an AssertFailure, but xor(FormatableBitSet) throws an NPE 16) The methods or(FormatableBitSet) and and(FormatableBitSet) can be used with args that are smaller or larger. xor() requires the arg to have the same size as "this" 17) readExternalFromArray does not appear to have a write counterpart. Using it on something that is wtitten with writeExternal does not seem to work
        Hide
        Dyre Tjeldvoll added a comment -

        While looking at this class I noticed that some methods and parts of methods never get executed (according to the code coverage listing on the Wiki). I thought it would be good to have a unit test for the class before making bigger changes to it, so I have attached a junit-test proposal (FormatableBitSetTest.java).

        Currently I have placed the test in the unitTest directory, but since there is no junit infrastructure there or in the directory above, this may not be the best option. Suggestions?

        Thanks.

        Show
        Dyre Tjeldvoll added a comment - While looking at this class I noticed that some methods and parts of methods never get executed (according to the code coverage listing on the Wiki). I thought it would be good to have a unit test for the class before making bigger changes to it, so I have attached a junit-test proposal (FormatableBitSetTest.java). Currently I have placed the test in the unitTest directory, but since there is no junit infrastructure there or in the directory above, this may not be the best option. Suggestions? Thanks.
        Hide
        Knut Anders Hatlen added a comment -

        Thanks Dyre. The patch looks good. Committed revision 489053.

        Show
        Knut Anders Hatlen added a comment - Thanks Dyre. The patch looks good. Committed revision 489053.
        Hide
        Dyre Tjeldvoll added a comment -

        Removed an unnecessary cast to FormatableBitSet as well.

        Show
        Dyre Tjeldvoll added a comment - Removed an unnecessary cast to FormatableBitSet as well.
        Hide
        Dyre Tjeldvoll added a comment -

        Added a patch that removes some dead code. Reviews are appreciated.

        Show
        Dyre Tjeldvoll added a comment - Added a patch that removes some dead code. Reviews are appreciated.

          People

          • Assignee:
            Dyre Tjeldvoll
            Reporter:
            Dyre Tjeldvoll
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development