
Lazily initialize TermsEnum in BloomFilterPostingsFormat

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 4.0, 4.1
    • Fix Version/s: 4.2, 6.0
    • Component/s: core/codecs
    • Labels:
      None
    • Lucene Fields:
      New, Patch Available

      Description

      BloomFilteringPostingsFormat initializes its delegate TermsEnum directly inside the Terms#iterator() call, which can be a fairly heavy operation when executed thousands of times. I suspect that Bloom filter postings are mainly used for primary keys and similar fields, where lookups are mostly seekExact calls. Given that, most of the time we don't even need the delegate TermsEnum: most segments won't contain the key, and the Bloom filter will likely return false from seekExact without consulting the delegate.
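
      The idea can be illustrated with a minimal, self-contained sketch. It does not use Lucene's classes and is not the code from the attached patch: SimpleTermsEnum, LazyBloomTermsEnum, and the Predicate standing in for the Bloom filter are all hypothetical names chosen for illustration. The expensive delegate is only created the first time the filter cannot rule a term out.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Predicate;
import java.util.function.Supplier;

/** Simplified stand-in for the pattern; none of these types are Lucene classes. */
final class LazyDelegateSketch {

    /** Hypothetical minimal view of an expensive per-segment term enumerator. */
    interface SimpleTermsEnum {
        boolean seekExact(String term);
    }

    /** Creates the delegate only when the filter cannot rule the term out. */
    static final class LazyBloomTermsEnum implements SimpleTermsEnum {
        // Stand-in for the Bloom filter: false means "definitely absent",
        // true means "possibly present" (false positives are allowed).
        private final Predicate<String> mayContain;
        private final Supplier<SimpleTermsEnum> delegateFactory;
        private SimpleTermsEnum delegate; // null until first needed

        LazyBloomTermsEnum(Predicate<String> mayContain,
                           Supplier<SimpleTermsEnum> delegateFactory) {
            this.mayContain = mayContain;
            this.delegateFactory = delegateFactory;
        }

        @Override
        public boolean seekExact(String term) {
            if (!mayContain.test(term)) {
                return false; // definite miss: the delegate is never touched
            }
            return delegate().seekExact(term);
        }

        boolean delegateInitialized() {
            return delegate != null;
        }

        private SimpleTermsEnum delegate() {
            if (delegate == null) {
                delegate = delegateFactory.get(); // the expensive step, done at most once
            }
            return delegate;
        }
    }

    public static void main(String[] args) {
        Set<String> segmentTerms = new TreeSet<>(Arrays.asList("id-1", "id-2"));

        // For brevity the "filter" here is exact; a real Bloom filter may
        // also answer true for terms the segment does not contain.
        LazyBloomTermsEnum termsEnum = new LazyBloomTermsEnum(
                segmentTerms::contains,
                () -> segmentTerms::contains);

        System.out.println(termsEnum.seekExact("id-9"));
        System.out.println(termsEnum.delegateInitialized());
        System.out.println(termsEnum.seekExact("id-1"));
        System.out.println(termsEnum.delegateInitialized());
    }
}
```

      In this sketch a primary-key lookup against a segment that does not hold the key returns from seekExact without ever invoking the delegate factory, which is the cost the issue aims to avoid.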

      1. LUCENE-4687.patch
        8 kB
        Simon Willnauer
      2. LUCENE-4687.patch
        8 kB
        Simon Willnauer

        Activity

        thetaphi Uwe Schindler added a comment -

        Closed after release.

        commit-tag-bot Commit Tag Bot added a comment -

        [branch_4x commit] Simon Willnauer
        http://svn.apache.org/viewvc?view=revision&revision=1434672

        LUCENE-4687: Lazily initialize TermsEnum in BloomFilterPostingsFormat

        commit-tag-bot Commit Tag Bot added a comment -

        [trunk commit] Simon Willnauer
        http://svn.apache.org/viewvc?view=revision&revision=1434664

        LUCENE-4687: Lazily initialize TermsEnum in BloomFilterPostingsFormat

        simonw Simon Willnauer added a comment -

        I will commit this tomorrow...

        mikemccand Michael McCandless added a comment -

        +1

        simonw Simon Willnauer added a comment -

        new patch making reset return void...

        simonw Simon Willnauer added a comment -

        can the reset() method return void?

        hmm not sure, I can try but it's hard...

        rcmuir Robert Muir added a comment -

        can the reset() method return void?

        simonw Simon Willnauer added a comment -

        here is a patch... I also removed the IOException from Terms#comparator() to make it consistent with TermsEnum#comparator()


          People

          • Assignee:
            simonw Simon Willnauer
            Reporter:
            simonw Simon Willnauer
          • Votes:
            0
            Watchers:
            3

