Lucene - Core
  1. Lucene - Core
  2. LUCENE-5519

Make queueDepth enforcing optional in TopNSearcher

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: 4.7
    • Fix Version/s: 4.8, 6.0
    • Component/s: core/FSTs
    • Labels:
      None
    • Lucene Fields:
      New, Patch Available

      Description

      currently TopNSearcher enforces the maxQueueSize based on rejectedCount + topN. I have a usecase where I just simply don't know the exact limit and I am ok with a top N that is not 100% exact. Yet, if I don't specify the right upper limit for the queue size I get an assertion error when I run tests but the only workaround it to make the queue unbounded which looks odd while it would possibly work just fine. I think it's fair to add an option that just doesn't enforce the limit and if it shoudl be enforced we throw a real exception.

      1. LUCENE-5519.patch
        35 kB
        Simon Willnauer
      2. LUCENE-5519.patch
        30 kB
        Simon Willnauer
      3. LUCENE-5519.patch
        8 kB
        Simon Willnauer

        Activity

        Hide
        Simon Willnauer added a comment -

        here is a patch

        Show
        Simon Willnauer added a comment - here is a patch
        Hide
        Robert Muir added a comment -

        Another option might be to return a boolean in the MinResult indicating if the top-N is "complete" ?

        This way the caller could do whatever they want.

        But the current assert is bogus

        Show
        Robert Muir added a comment - Another option might be to return a boolean in the MinResult indicating if the top-N is "complete" ? This way the caller could do whatever they want. But the current assert is bogus
        Hide
        Simon Willnauer added a comment -

        yeah I like the `isPartial=True|False` idea... the caller can decide what todo and possibly assert there if needed. I will crank out another patch (I guess tomorrow)

        Show
        Simon Willnauer added a comment - yeah I like the `isPartial=True|False` idea... the caller can decide what todo and possibly assert there if needed. I will crank out another patch (I guess tomorrow)
        Hide
        Michael McCandless added a comment -

        +1 to return a boolean, just make sure the existing tests using this do the assert on that returned boolean; else, it means there's a serious bug (the search pruning is not admissible).

        Can we make the boolean isComplete? Ie, it returns true if no truncation happened?

        Show
        Michael McCandless added a comment - +1 to return a boolean, just make sure the existing tests using this do the assert on that returned boolean; else, it means there's a serious bug (the search pruning is not admissible). Can we make the boolean isComplete? Ie, it returns true if no truncation happened?
        Hide
        Simon Willnauer added a comment -

        next iteration... I added a new class TopResults that encodes the info if it's complete or not and added asssertions where appropriate.

        Show
        Simon Willnauer added a comment - next iteration... I added a new class TopResults that encodes the info if it's complete or not and added asssertions where appropriate.
        Hide
        Michael McCandless added a comment -

        Hmm, I don't like that we pre-allocate the full array[topN] here. Can we just use a List<X> for the input/output pairs? Or maybe just to back to List<MinResult>?

        It's also sort of strange to have spaceNeeded/isFull methods: it makes this class more like a queue and less like a set of final results. I'd prefer if it were more like TopDocs: its purpose is to simply deliver results. I think those methods/queue state tracking should be outside of that class.

        Show
        Michael McCandless added a comment - Hmm, I don't like that we pre-allocate the full array [topN] here. Can we just use a List<X> for the input/output pairs? Or maybe just to back to List<MinResult>? It's also sort of strange to have spaceNeeded/isFull methods: it makes this class more like a queue and less like a set of final results. I'd prefer if it were more like TopDocs: its purpose is to simply deliver results. I think those methods/queue state tracking should be outside of that class.
        Hide
        Simon Willnauer added a comment -

        here is a new patch moving closer to TopDocs again.I don't use an array in the TopResults since it's a pretty useless conversion and I implemented Iterable such that we can directly use it in a foreach loop. I think it's ready.

        Show
        Simon Willnauer added a comment - here is a new patch moving closer to TopDocs again.I don't use an array in the TopResults since it's a pretty useless conversion and I implemented Iterable such that we can directly use it in a foreach loop. I think it's ready.
        Hide
        Michael McCandless added a comment -

        +1, looks great. Thanks Simon!

        Show
        Michael McCandless added a comment - +1, looks great. Thanks Simon!
        Hide
        ASF subversion and git services added a comment -

        Commit 1576825 from Simon Willnauer in branch 'dev/trunk'
        [ https://svn.apache.org/r1576825 ]

        LUCENE-5519: Make queueDepth enforcing optional in TopNSearcher

        Show
        ASF subversion and git services added a comment - Commit 1576825 from Simon Willnauer in branch 'dev/trunk' [ https://svn.apache.org/r1576825 ] LUCENE-5519 : Make queueDepth enforcing optional in TopNSearcher
        Hide
        ASF subversion and git services added a comment -

        Commit 1576860 from Simon Willnauer in branch 'dev/branches/branch_4x'
        [ https://svn.apache.org/r1576860 ]

        LUCENE-5519: Make queueDepth enforcing optional in TopNSearcher

        Show
        ASF subversion and git services added a comment - Commit 1576860 from Simon Willnauer in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1576860 ] LUCENE-5519 : Make queueDepth enforcing optional in TopNSearcher
        Hide
        ASF subversion and git services added a comment -

        Commit 1577062 from Uwe Schindler in branch 'dev/trunk'
        [ https://svn.apache.org/r1577062 ]

        LUCENE-5519: Fix generics

        Show
        ASF subversion and git services added a comment - Commit 1577062 from Uwe Schindler in branch 'dev/trunk' [ https://svn.apache.org/r1577062 ] LUCENE-5519 : Fix generics
        Hide
        ASF subversion and git services added a comment -

        Commit 1577063 from Uwe Schindler in branch 'dev/branches/branch_4x'
        [ https://svn.apache.org/r1577063 ]

        Merged revision(s) 1577062 from lucene/dev/trunk:
        LUCENE-5519: Fix generics

        Show
        ASF subversion and git services added a comment - Commit 1577063 from Uwe Schindler in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1577063 ] Merged revision(s) 1577062 from lucene/dev/trunk: LUCENE-5519 : Fix generics
        Hide
        Uwe Schindler added a comment -

        Close issue after release of 4.8.0

        Show
        Uwe Schindler added a comment - Close issue after release of 4.8.0

          People

          • Assignee:
            Unassigned
            Reporter:
            Simon Willnauer
          • Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development