Lucene - Core
  1. Lucene - Core
  2. LUCENE-2261

configurable MultiTermQuery TopTermsScoringBooleanRewrite pq size

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.0-ALPHA
    • Component/s: core/search
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      MultiTermQuery has a TopTermsScoringBooleanRewrite, that uses a priority queue to expand the query to the top-N terms.

      currently N is hardcoded at BooleanQuery.getMaxClauseCount(), but it would be nice to be able to set this for top-N MultiTermQueries: e.g. expand a fuzzy query to at most only the 50 closest terms.

      at a glance it seems one way would be to expose TopTermsScoringBooleanRewrite (it is private right now) and add a ctor to it, so a MultiTermQuery can instantiate one with its own limit.

      1. LUCENE-2261.patch
        10 kB
        Robert Muir
      2. LUCENE-2261.patch
        10 kB
        Robert Muir
      3. LUCENE-2261.patch
        10 kB
        Robert Muir
      4. LUCENE-2261.patch
        4 kB
        Robert Muir
      5. LUCENE-2261.patch
        3 kB
        Robert Muir

        Issue Links

          Activity

          Hide
          Robert Muir added a comment -

          Committed revision 912311.

          Show
          Robert Muir added a comment - Committed revision 912311.
          Hide
          Robert Muir added a comment -

          will commit this in a day or two if no one objects.

          Show
          Robert Muir added a comment - will commit this in a day or two if no one objects.
          Hide
          Robert Muir added a comment -

          There is only one thing unrelated to that issue: It makes no sense to declare IllegalArgExceptions as they are unchecked. I would remove them, else the compiler does.

          Agreed, here is a patch with this cleanup, and also removes an unused import from MTQ

          Show
          Robert Muir added a comment - There is only one thing unrelated to that issue: It makes no sense to declare IllegalArgExceptions as they are unchecked. I would remove them, else the compiler does. Agreed, here is a patch with this cleanup, and also removes an unused import from MTQ
          Hide
          Uwe Schindler added a comment -

          Hi Robert, patch looks good, all tests pass, nothing to complain from the MTQ police

          There is only one thing unrelated to that issue: It makes no sense to declare IllegalArgExceptions as they are unchecked. I would remove them, else the compiler does.

          Show
          Uwe Schindler added a comment - Hi Robert, patch looks good, all tests pass, nothing to complain from the MTQ police There is only one thing unrelated to that issue: It makes no sense to declare IllegalArgExceptions as they are unchecked. I would remove them, else the compiler does.
          Hide
          Robert Muir added a comment -

          Looks good when gaining a first insight. I have not tried the patch, will do soon.

          Uwe lemme know if everything is ok, once multitermquery policeman is happy I will look at proceeding to commit

          Show
          Robert Muir added a comment - Looks good when gaining a first insight. I have not tried the patch, will do soon. Uwe lemme know if everything is ok, once multitermquery policeman is happy I will look at proceeding to commit
          Hide
          Uwe Schindler added a comment -

          Looks good when gaining a first insight. I have not tried the patch, will do soon.

          Show
          Uwe Schindler added a comment - Looks good when gaining a first insight. I have not tried the patch, will do soon.
          Hide
          Robert Muir added a comment -

          sorry, i created a javadoc warning, (TOP_TERMS singleton was referenced in the top of mtq javadocs). here is the fix

          Show
          Robert Muir added a comment - sorry, i created a javadoc warning, (TOP_TERMS singleton was referenced in the top of mtq javadocs). here is the fix
          Hide
          Robert Muir added a comment -

          here is a patch with uwe's suggestions. now that fuzzyquery supports the param via ctor, the singleton TopTerms is no longer used so i removed it.

          Show
          Robert Muir added a comment - here is a patch with uwe's suggestions. now that fuzzyquery supports the param via ctor, the singleton TopTerms is no longer used so i removed it.
          Hide
          Robert Muir added a comment -

          the previous patch was wrong for readResolve, here is a fix.

          Show
          Robert Muir added a comment - the previous patch was wrong for readResolve, here is a fix.
          Hide
          Uwe Schindler added a comment -

          Patch looks good, some things because of serializable:

          • The readResove method must go to the singleton constant, which should also throw UOE when modified
          • euquals / hashcode is neaded for the rewritemode, else FuzzyQuery & Co would no longer compare

          It could be solved by doing like for AutoRewrite and its unmodifiable constant. I know: Queries are a pain because of Serializable.

          +1 on adding a param to FuzzyQuery ctor

          Show
          Uwe Schindler added a comment - Patch looks good, some things because of serializable: The readResove method must go to the singleton constant, which should also throw UOE when modified euquals / hashcode is neaded for the rewritemode, else FuzzyQuery & Co would no longer compare It could be solved by doing like for AutoRewrite and its unmodifiable constant. I know: Queries are a pain because of Serializable. +1 on adding a param to FuzzyQuery ctor
          Hide
          Robert Muir added a comment -

          attached is a patch. i changed FuzzyQueries pq test to use this so it does not have to do the try-finally thing/BooleanQuery.maxClauseCount

          Show
          Robert Muir added a comment - attached is a patch. i changed FuzzyQueries pq test to use this so it does not have to do the try-finally thing/BooleanQuery.maxClauseCount

            People

            • Assignee:
              Robert Muir
              Reporter:
              Robert Muir
            • Votes:
              0 Vote for this issue
              Watchers:
              0 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development