Lucene - Core / LUCENE-4687

Lazily initialize TermsEnum in BloomFilterPostingsFormat

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 4.0, 4.1
    • Fix Version/s: 4.2, 6.0
    • Component/s: core/codecs
    • Labels:
      None
    • Lucene Fields:
      New, Patch Available

      Description

      BloomFilteringPostingsFormat initializes its delegate TermsEnum directly inside the Terms#iterator() call, which can be a pretty heavy operation if executed thousands of times. I suspect that bloom filter postings are mainly used for primary keys etc., where the access pattern is mostly seekExact. Given that, most of the time we don't even need the delegate TermsEnum, since most segments won't contain the key and the bloom filter will likely return false from seekExact without consulting the delegate.
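      The idea can be illustrated with a minimal, self-contained sketch. It does not use Lucene's real classes: SimpleTermsEnum, MembershipFilter, LazyBloomTermsEnum and the delegateCreations counter are hypothetical names invented for illustration, and an exact Set stands in for the bloom filter. The sketch only shows the pattern the description proposes, namely deferring creation of the delegate until the filter says the term might be present.

```java
import java.util.Set;
import java.util.function.Supplier;

public class LazyBloomSketch {

    /** Hypothetical stand-in for a terms enumerator; NOT Lucene's TermsEnum. */
    interface SimpleTermsEnum {
        boolean seekExact(String term);
    }

    /** Hypothetical stand-in for a bloom filter: false means "definitely absent". */
    interface MembershipFilter {
        boolean mightContain(String term);
    }

    /** Wraps an expensive-to-create delegate and builds it only on first real use. */
    static final class LazyBloomTermsEnum implements SimpleTermsEnum {
        private final MembershipFilter filter;
        private final Supplier<SimpleTermsEnum> delegateFactory;
        private SimpleTermsEnum delegate;   // stays null until actually needed
        int delegateCreations = 0;          // illustration only: counts factory calls

        LazyBloomTermsEnum(MembershipFilter filter, Supplier<SimpleTermsEnum> delegateFactory) {
            this.filter = filter;
            this.delegateFactory = delegateFactory;
        }

        private SimpleTermsEnum delegate() {
            if (delegate == null) {
                delegate = delegateFactory.get();
                delegateCreations++;
            }
            return delegate;
        }

        @Override
        public boolean seekExact(String term) {
            // Fast path: the filter rules the term out, so the delegate is never touched.
            if (!filter.mightContain(term)) {
                return false;
            }
            // Possible hit (or false positive): only now pay for the delegate.
            return delegate().seekExact(term);
        }
    }

    public static void main(String[] args) {
        Set<String> segmentTerms = Set.of("id-1", "id-2");
        // Assumption: an exact set plays the role of the filter in this demo.
        LazyBloomTermsEnum termsEnum =
            new LazyBloomTermsEnum(segmentTerms::contains, () -> segmentTerms::contains);

        System.out.println(termsEnum.seekExact("id-99") + " " + termsEnum.delegateCreations);
        System.out.println(termsEnum.seekExact("id-1") + " " + termsEnum.delegateCreations);
    }
}
```

      In this sketch a lookup for a key that the filter rejects never invokes the factory, which mirrors the primary-key case described above where most segments do not contain the key.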

      1. LUCENE-4687.patch
        8 kB
        Simon Willnauer
      2. LUCENE-4687.patch
        8 kB
        Simon Willnauer

        Activity

        Simon Willnauer added a comment -

        here is a patch... I also removed the IOException from Terms#comparator() to make it consistent with TermsEnum#comparator()

        Robert Muir added a comment -

        can the reset() method return void?

        Simon Willnauer added a comment -

        can the reset() method return void?

        hmm not sure, I can try but it's hard...

        Simon Willnauer added a comment -

        new patch making reset return void...

        Michael McCandless added a comment -

        +1

        Simon Willnauer added a comment -

        I will commit this tomorrow...

        Commit Tag Bot added a comment -

        [trunk commit] Simon Willnauer
        http://svn.apache.org/viewvc?view=revision&revision=1434664

        LUCENE-4687: Lazily initialize TermsEnum in BloomFilterPostingsFormat

        Commit Tag Bot added a comment -

        [branch_4x commit] Simon Willnauer
        http://svn.apache.org/viewvc?view=revision&revision=1434672

        LUCENE-4687: Lazily initialize TermsEnum in BloomFilterPostingsFormat

        Uwe Schindler added a comment -

        Closed after release.


          People

          • Assignee: Simon Willnauer
          • Reporter: Simon Willnauer
          • Votes: 0
          • Watchers: 3
