Lucene - Core / LUCENE-1506

Adding FilteredDocIdSet and FilteredDocIdSetIterator

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.4
    • Fix Version/s: 2.9
    • Component/s: core/search
    • Labels:
      None
    • Lucene Fields:
      New, Patch Available

      Description

      Adding two convenience classes: FilteredDocIdSet and FilteredDocIdSetIterator.

      1. filteredDocidset.txt
        2 kB
        John Wang
      2. filteredDocidset2.txt
        7 kB
        John Wang
      3. LUCENE-1506.patch
        10 kB
        Michael McCandless

        Activity

        Michael McCandless added a comment -

        Can't this functionality be achieved via a normal Filter (and
        ChainedFilter if you need to AND two Filters together)? I.e., why
        introduce a new interface (with the "match" method)?

        John Wang added a comment -

        Filter calculates a DocSet given an IndexReader. Imagine a large index where the logic to decide whether a document belongs in the set is non-trivial, so building this DocSet can be expensive.

        When the driving query produces a very small result set, the validation can instead be performed only on that small set via the match call.

        Yes, in terms of functionality, one can do this with a Filter, but it is wasteful to run the validation over the entire index when the number of candidate hits is small.
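
        The idea above can be illustrated with a small standalone sketch. It does not use Lucene's classes: DocIterator, FilteredIterator, over and collect are hypothetical stand-ins invented for this example, and the even-number test stands in for an expensive validation. It only shows the general shape of wrapping an existing iterator and calling match on the candidates it produces, rather than evaluating every document in the index up front.

```java
import java.util.ArrayList;
import java.util.List;

/** Standalone sketch; every type here is a hypothetical stand-in, not a Lucene class. */
final class LazyFilterSketch {

    /** Minimal forward-only iterator over ascending doc ids (assumed shape). */
    interface DocIterator {
        int NO_MORE_DOCS = Integer.MAX_VALUE;

        /** Returns the next doc id, or NO_MORE_DOCS when exhausted. */
        int nextDoc();
    }

    /** Iterates over a sorted list of candidate doc ids, e.g. the hits of a driving query. */
    static DocIterator over(final int... sortedDocs) {
        return new DocIterator() {
            private int pos = 0;

            @Override
            public int nextDoc() {
                return pos < sortedDocs.length ? sortedDocs[pos++] : NO_MORE_DOCS;
            }
        };
    }

    /** Wraps another iterator and passes through only the docs accepted by match(). */
    abstract static class FilteredIterator implements DocIterator {
        private final DocIterator inner;

        FilteredIterator(DocIterator inner) {
            this.inner = inner;
        }

        /** Validation hook; called once per candidate doc, never for the whole index. */
        protected abstract boolean match(int doc);

        @Override
        public int nextDoc() {
            int doc;
            while ((doc = inner.nextDoc()) != NO_MORE_DOCS) {
                if (match(doc)) {
                    return doc;
                }
            }
            return NO_MORE_DOCS;
        }
    }

    /** Drains an iterator into a list. */
    static List<Integer> collect(DocIterator it) {
        List<Integer> out = new ArrayList<>();
        int doc;
        while ((doc = it.nextDoc()) != DocIterator.NO_MORE_DOCS) {
            out.add(doc);
        }
        return out;
    }

    public static void main(String[] args) {
        final int[] calls = {0}; // counts how often the validation runs

        // Four candidates from the (imagined) driving query.
        DocIterator filtered = new FilteredIterator(over(3, 8, 21, 40)) {
            @Override
            protected boolean match(int doc) {
                calls[0]++;
                return doc % 2 == 0; // stand-in for a costly per-document check
            }
        };

        // Prints: [8, 40] after 4 match() calls
        System.out.println(collect(filtered) + " after " + calls[0] + " match() calls");
    }
}
```

        In this sketch the validation runs four times, once per candidate, regardless of how many documents the index holds. An eagerly built set would instead have to evaluate the check for every document before the query runs.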

        Michael McCandless added a comment -

        OK, I see. That is an important difference; I think it makes sense to add this. Could you add javadocs & a unit test? Thanks John.

        John Wang added a comment -

        Sure, will work on that.

        John Wang added a comment -

        Javadoc and unit test added.

        Michael McCandless added a comment -

        Thanks John! I made a few tweaks ("downgraded" to Java 1.4, expanded javadocs, fixed whitespace, etc.). I think it's ready to commit. I'll wait a day or two.

        John Wang added a comment -

        Thanks Michael!

        Michael McCandless added a comment -

        Committed revision 740361. Thanks John!


          People

          • Assignee:
            Michael McCandless
            Reporter:
            John Wang
          • Votes:
            0
            Watchers:
            0

            Dates

            • Created:
              Updated:
              Resolved:

              Time Tracking

              Estimated:
              1h
              Remaining:
              1h
              Logged:
              Not Specified
