Uploaded image for project: 'Calcite'
  1. Calcite
  2. CALCITE-460

Add ImmutableBitSet and replace uses of BitSet

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.0.0-incubating
    • Component/s: None
    • Labels:
      None

      Description

      Calcite makes heavy use of bit sets. java.util.BitSet has a convenient API but (a) is mutable, (b) uses quite a lot of memory, (c) is not iterable.

      Propose to implement ImmutableBitSet, which addresses those deficiencies.

        Activity

        Hide
        vladimirsitnikov Vladimir Sitnikov added a comment -

        JDK BitSet is just

            /**
             * The internal field corresponding to the serialField "bits".
             */
            private long[] words;
        
            /**
             * The number of words in the logical size of this BitSet.
             */
            private transient int wordsInUse = 0;
        
            /**
             * Whether the size of "words" is user-specified.  If so, we assume
             * the user knows what he's doing and try harder to preserve it.
             */
            private transient boolean sizeIsSticky = false;
        

        How do you plan to make it more compact?

        Show
        vladimirsitnikov Vladimir Sitnikov added a comment - JDK BitSet is just /** * The internal field corresponding to the serialField "bits" . */ private long [] words; /** * The number of words in the logical size of this BitSet. */ private transient int wordsInUse = 0; /** * Whether the size of "words" is user-specified. If so, we assume * the user knows what he's doing and try harder to preserve it. */ private transient boolean sizeIsSticky = false ; How do you plan to make it more compact?
        Hide
        julianhyde Julian Hyde added a comment -

        You can get rid of wordsInUse and sizeIsSticky if the set is immutable. And there could be a variant that use just a single long if the highest bit is less than 64.

        Show
        julianhyde Julian Hyde added a comment - You can get rid of wordsInUse and sizeIsSticky if the set is immutable. And there could be a variant that use just a single long if the highest bit is less than 64.
        Hide
        vladimirsitnikov Vladimir Sitnikov added a comment - - edited

        I have no preference, however this is a doubtful change: it could result in polymorphic dispatch of bitSet methods since JVM would face with a lot of different implementations.

        Show
        vladimirsitnikov Vladimir Sitnikov added a comment - - edited I have no preference, however this is a doubtful change: it could result in polymorphic dispatch of bitSet methods since JVM would face with a lot of different implementations.
        Hide
        julianhyde Julian Hyde added a comment - - edited

        Agreed. I have been worrying about dispatch too. I have only one implementation for now, ImmutableBitSet, which has a long[] words field. The big breaking change is to switch APIs from BitSet to ImmutableBitSet, and that can only be done now (switch over to 1.0).

        We can add an InlineImmutableBitSet sub-class in the future if it is proven beneficial.

        Show
        julianhyde Julian Hyde added a comment - - edited Agreed. I have been worrying about dispatch too. I have only one implementation for now, ImmutableBitSet , which has a long[] words field. The big breaking change is to switch APIs from BitSet to ImmutableBitSet , and that can only be done now (switch over to 1.0). We can add an InlineImmutableBitSet sub-class in the future if it is proven beneficial.
        Hide
        vladimirsitnikov Vladimir Sitnikov added a comment -

        Agreed.
        So using ImmutableBitSet just for immutability and breaking the API works for me.

        Show
        vladimirsitnikov Vladimir Sitnikov added a comment - Agreed. So using ImmutableBitSet just for immutability and breaking the API works for me.
        Hide
        julianhyde Julian Hyde added a comment -
        Show
        julianhyde Julian Hyde added a comment - I have checked in my changes as https://github.com/julianhyde/incubator-calcite/commit/47ca054761d9445095896cb77e7713174f0baa14 . Comments welcome.
        Hide
        vladimirsitnikov Vladimir Sitnikov added a comment -

        That's a big change.

        +    if (builder.wouldEqual(bitSet)) {
        +      return bitSet;
        +    }
        +    return builder.build();
        

        What if Builder(ImmutableBitSet) remembers the input bitset and returns that in the build method if the result is the same?

        Show
        vladimirsitnikov Vladimir Sitnikov added a comment - That's a big change. + if (builder.wouldEqual(bitSet)) { + return bitSet; + } + return builder.build(); What if Builder(ImmutableBitSet) remembers the input bitset and returns that in the build method if the result is the same?
        Hide
        julianhyde Julian Hyde added a comment -

        Take a look at how ImmutableBitSet is used. There are 2 distinct modes of operation: for one set operation, or a sequence of single-bit operations. I figured that the check is worthwhile for the former but not the latter, so I put the check into the ImmutableBitSet method (e.g. union) rather than the Builder. But I don't feel strongly about it.

        Show
        julianhyde Julian Hyde added a comment - Take a look at how ImmutableBitSet is used. There are 2 distinct modes of operation: for one set operation, or a sequence of single-bit operations. I figured that the check is worthwhile for the former but not the latter, so I put the check into the ImmutableBitSet method (e.g. union) rather than the Builder. But I don't feel strongly about it.
        Hide
        julianhyde Julian Hyde added a comment -

        I'm adding

            public ImmutableBitSet build(ImmutableBitSet bitSet) {
              if (wouldEqual(bitSet)) {
                return bitSet;
              }
              return build();
            }
        

        using it in several places, and making wouldEqual private. Keeps the onus on the client rather than Builder to remember the original bit set.

        Show
        julianhyde Julian Hyde added a comment - I'm adding public ImmutableBitSet build(ImmutableBitSet bitSet) { if (wouldEqual(bitSet)) { return bitSet; } return build(); } using it in several places, and making wouldEqual private. Keeps the onus on the client rather than Builder to remember the original bit set.
        Show
        julianhyde Julian Hyde added a comment - Fixed in http://git-wip-us.apache.org/repos/asf/incubator-calcite/commit/b9d8de38d2aa4ee6791aec42ffe2adb58942060a .
        Hide
        julianhyde Julian Hyde added a comment -

        Closing now that 1.0.0-incubating has been released.

        Show
        julianhyde Julian Hyde added a comment - Closing now that 1.0.0-incubating has been released.

          People

          • Assignee:
            julianhyde Julian Hyde
            Reporter:
            julianhyde Julian Hyde
          • Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development