Solr
  1. Solr
  2. SOLR-875

Consolidate Solr's and Lucene's OpenBitSet classes

    Details

    • Type: Task Task
    • Status: Resolved
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      Currently there are two versions of OpenBitSet and BitUtil in Solr and Lucene.

      We should only have one version of these classes in Lucene, that Solr should use.

      Tasks here:

      • Merge different versions into Lucene
      • Make Solr classes use/extend the classes in Lucene (we need to keep the Solr ones for backwards-compatibility)
      • Deprecate the classes in Solr
      • Change all references in Solr to use the classes in Lucene

      One difficulty here is Solr's BitSetIterator vs. Lucene's OpenBitSetIterator. Both have a next() method, however one returns an int (BitSetIterator), the other one returns a boolean and offers a doc() method to get the doc id. So I can't make BitSetIterator extend OpenBitSetIterator. There are not many places in Solr's core that use BitSetIterator, so we could simply change e.g. search/BitDocSet.java to use OpenBitSetIterator. This would however require to change the call to next() into two calls to next() and doc(). I wonder if this would be a noticeable performance hit?

      We could of course also leave both iterators and only merge OpenBitSet and BitUtil, but I'd prefer to only have one iterator, because they basically do exactly the same.

      1. solr-875.patch
        52 kB
        Michael Busch

        Issue Links

          Activity

          Hide
          Michael Busch added a comment -

          This would however require to change the call to next() into two calls to next() and doc(). I wonder if this would be a noticeable performance hit?

          I can also add a

          int nextDoc()
          

          method to OpenBitSetIterator, that Solr can use. But it would have to set curDocId before returning the value. So still a tiny overhead, but probably less than two method calls.

          Show
          Michael Busch added a comment - This would however require to change the call to next() into two calls to next() and doc(). I wonder if this would be a noticeable performance hit? I can also add a int nextDoc() method to OpenBitSetIterator, that Solr can use. But it would have to set curDocId before returning the value. So still a tiny overhead, but probably less than two method calls.
          Hide
          Michael Busch added a comment -

          This patch:

          • deprecates OpenBitSet, BitUtil and BitSetIterator in org.apache.solr.util
          • updates all places in Solr's core that reference these classes to use the ones in Lucene

          All tests pass and this patch is backwards-compatible.
          Note that this patch only compiles against the current Lucene trunk + LUCENE-1467.

          Show
          Michael Busch added a comment - This patch: deprecates OpenBitSet, BitUtil and BitSetIterator in org.apache.solr.util updates all places in Solr's core that reference these classes to use the ones in Lucene All tests pass and this patch is backwards-compatible. Note that this patch only compiles against the current Lucene trunk + LUCENE-1467 .
          Hide
          Michael Busch added a comment -

          Note that Yonik recently (August '08) commit this to Lucene's OpenBitSet:

          URL: http://svn.apache.org/viewvc?rev=690302&view=rev
          Log:
          fix OpenBitSet.hashCode rotate
          
          Modified:
              lucene/java/trunk/src/java/org/apache/lucene/util/OpenBitSet.java
          
          Modified: lucene/java/trunk/src/java/org/apache/lucene/util/OpenBitSet.java
          URL: http://svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/util/OpenBitSet.java?rev=690302&r1=690301&r2=690302&view=diff
          ==============================================================================
          --- lucene/java/trunk/src/java/org/apache/lucene/util/OpenBitSet.java (original)
          +++ lucene/java/trunk/src/java/org/apache/lucene/util/OpenBitSet.java Fri Aug 29 08:33:44 2008
          @@ -763,7 +763,7 @@
                 long h = 0x98761234;  // something non-zero for length==0
                 for (int i = bits.length; --i>=0;) {
                 h ^= bits[i];
          -      h = (h << 1) | (h >>> 31); // rotate left
          +      h = (h << 1) | (h >>> 63); // rotate left
               }
               return (int)((h>>32) ^ h);  // fold leftmost bits into right
             }
          

          This fix wasn't committed to Solr yet.

          Show
          Michael Busch added a comment - Note that Yonik recently (August '08) commit this to Lucene's OpenBitSet: URL: http: //svn.apache.org/viewvc?rev=690302&view=rev Log: fix OpenBitSet.hashCode rotate Modified: lucene/java/trunk/src/java/org/apache/lucene/util/OpenBitSet.java Modified: lucene/java/trunk/src/java/org/apache/lucene/util/OpenBitSet.java URL: http: //svn.apache.org/viewvc/lucene/java/trunk/src/java/org/apache/lucene/util/OpenBitSet.java?rev=690302&r1=690301&r2=690302&view=diff ============================================================================== --- lucene/java/trunk/src/java/org/apache/lucene/util/OpenBitSet.java (original) +++ lucene/java/trunk/src/java/org/apache/lucene/util/OpenBitSet.java Fri Aug 29 08:33:44 2008 @@ -763,7 +763,7 @@ long h = 0x98761234; // something non-zero for length==0 for ( int i = bits.length; --i>=0;) { h ^= bits[i]; - h = (h << 1) | (h >>> 31); // rotate left + h = (h << 1) | (h >>> 63); // rotate left } return ( int )((h>>32) ^ h); // fold leftmost bits into right } This fix wasn't committed to Solr yet.
          Hide
          Michael Busch added a comment -

          I just committed and resolved LUCENE-1467.

          Show
          Michael Busch added a comment - I just committed and resolved LUCENE-1467 .
          Hide
          Grant Ingersoll added a comment -

          Committed revision 723994.

          Had a slight change to the patch for the new UninvertedField faceting feature that was recently added.

          Thanks, Michael!

          Show
          Grant Ingersoll added a comment - Committed revision 723994. Had a slight change to the patch for the new UninvertedField faceting feature that was recently added. Thanks, Michael!
          Hide
          Michael Busch added a comment -

          Had a slight change to the patch for the new UninvertedField faceting feature that was recently added.

          Great, thanks Grant!

          Show
          Michael Busch added a comment - Had a slight change to the patch for the new UninvertedField faceting feature that was recently added. Great, thanks Grant!

            People

            • Assignee:
              Unassigned
              Reporter:
              Michael Busch
            • Votes:
              0 Vote for this issue
              Watchers:
              0 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development