Details

    • Type: Task Task
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 5.3, 6.0
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      In the same spirit as LUCENE-6531 for the PhraseQuery, we should make BooleanQuery immutable.

      The plan is the following:

      • create BooleanQuery.Builder with the same setters as BooleanQuery today (except setBoost) and a build() method that returns a BooleanQuery
      • remove setters from BooleanQuery (except setBoost)

      I would also like to add some static utility methods for common use-cases of this query, for instance:

      • static BooleanQuery disjunction(Query... queries) to create a disjunction
      • static BooleanQuery conjunction(Query... queries) to create a conjunction
      • static BooleanQuery filtered(Query query, Query... filters) to create a filtered query

      Hopefully this will help keep tests not too verbose, and the latter will also help with the FilteredQuery derecation/removal.

      1. LUCENE-6570.patch
        552 kB
        Adrien Grand

        Issue Links

          Activity

          Hide
          Robert Muir added a comment -

          static BooleanQuery disjunction(Query... queries) to create a disjunction
          static BooleanQuery conjunction(Query... queries) to create a conjunction
          static BooleanQuery filtered(Query query, Query... filters) to create a filtered query

          Can this be deferred to another issue? I'm really not a fan of such static methods. They are unnatural, almost always named badly, and ultimately result in making code just harder to read.

          less lines of code != easier to read.

          Show
          Robert Muir added a comment - static BooleanQuery disjunction(Query... queries) to create a disjunction static BooleanQuery conjunction(Query... queries) to create a conjunction static BooleanQuery filtered(Query query, Query... filters) to create a filtered query Can this be deferred to another issue? I'm really not a fan of such static methods. They are unnatural, almost always named badly, and ultimately result in making code just harder to read. less lines of code != easier to read.
          Hide
          Adrien Grand added a comment -

          I can defer, the reason I was trying to avoid it is that it will make the migration slightly harder.

          less lines of code != easier to read.

          It's different but not completely unrelated. Also my main point was about making BooleanQuery easier to use for common use-cases.

          Show
          Adrien Grand added a comment - I can defer, the reason I was trying to avoid it is that it will make the migration slightly harder. less lines of code != easier to read. It's different but not completely unrelated. Also my main point was about making BooleanQuery easier to use for common use-cases.
          Hide
          Adrien Grand added a comment -

          Here is a patch. The main changes went to multi-term queries, DrillDownQuery and BoostingQuery:

          • Multi-term queries can't use BooleanQuery as a builder anymore since it's immutable so I had to change the API to allow to use something that is not a query as a builder.
          • DrillDownQuery was also using BooleanQuery as a buffer so I had to refactor it to use a List instead and only build the BooleanQuery when necessary.
          • BoostingQuery couldn't extend BooleanQuery anymore, so now it has its own Weight/Scorer implementation.

          Here is what the TermCollectingRewrite API looks like now:

          abstract class TermCollectingRewrite<B> extends MultiTermQuery.RewriteMethod {
          
            /** Return a suitable builder for the top-level Query for holding all expanded terms. */
            protected abstract B getTopLevelBuilder() throws IOException;
          
            /** Finalize the creation of the query from the builder. */
            protected abstract Query build(B builder);
          }
          
          Show
          Adrien Grand added a comment - Here is a patch. The main changes went to multi-term queries, DrillDownQuery and BoostingQuery: Multi-term queries can't use BooleanQuery as a builder anymore since it's immutable so I had to change the API to allow to use something that is not a query as a builder. DrillDownQuery was also using BooleanQuery as a buffer so I had to refactor it to use a List instead and only build the BooleanQuery when necessary. BoostingQuery couldn't extend BooleanQuery anymore, so now it has its own Weight/Scorer implementation. Here is what the TermCollectingRewrite API looks like now: abstract class TermCollectingRewrite<B> extends MultiTermQuery.RewriteMethod { /** Return a suitable builder for the top-level Query for holding all expanded terms. */ protected abstract B getTopLevelBuilder() throws IOException; /** Finalize the creation of the query from the builder. */ protected abstract Query build(B builder); }
          Hide
          Uwe Schindler added a comment -

          MTQ changes look OK, +1

          Show
          Uwe Schindler added a comment - MTQ changes look OK, +1
          Hide
          Cao Manh Dat added a comment - - edited

          Wow, It is a lot of work Adrien Grand. I notice that you change the constructors of BooleanQuery to private so can we do same thing with PhraseQuery in LUCENE-6575? We can remove a lot of code there

          Show
          Cao Manh Dat added a comment - - edited Wow, It is a lot of work Adrien Grand . I notice that you change the constructors of BooleanQuery to private so can we do same thing with PhraseQuery in LUCENE-6575 ? We can remove a lot of code there
          Hide
          ASF subversion and git services added a comment -

          Commit 1686145 from Adrien Grand in branch 'dev/trunk'
          [ https://svn.apache.org/r1686145 ]

          LUCENE-6570: Make BooleanQuery immutable.

          Show
          ASF subversion and git services added a comment - Commit 1686145 from Adrien Grand in branch 'dev/trunk' [ https://svn.apache.org/r1686145 ] LUCENE-6570 : Make BooleanQuery immutable.
          Hide
          ASF subversion and git services added a comment -

          Commit 1686150 from Adrien Grand in branch 'dev/trunk'
          [ https://svn.apache.org/r1686150 ]

          LUCENE-6570: Make BooleanQuery.Builder.add(Query,Occur) return this, like other methods.

          Show
          ASF subversion and git services added a comment - Commit 1686150 from Adrien Grand in branch 'dev/trunk' [ https://svn.apache.org/r1686150 ] LUCENE-6570 : Make BooleanQuery.Builder.add(Query,Occur) return this, like other methods.
          Hide
          Hoss Man added a comment -

          I'm confused about the status of this issue.

          it's currently marked "Resolved" in 6.0, and trunk has the new API with all of the formerly public constructors for BooleanQuery completely deleted – but i don't see any commits/patches/mentions to backporting the builder & marking these constructors deprecated in the 5x branch. (let alone any mention in MIGRATE.txt)


          I'm also not sure how i feel about the defensive cloning of (sub) queries here.

          It's a significant enough difference from how constructing BooleanQueries
          worked in the past that at an absolute minimum the javadocs for the builder should be crystal clear that a clone is happening under the covers – but it also means that users who have existing code like this...

          public final static REUSED_FILTER = buildReallyMassiveQueryForReuse();
          ...
          loop {
            BooleanQuery bq = new BooleanQuery();
            bq.add(REUSED_FILTER, MUST);
            bq.add(buildQueryFromInput, MUST);
            ...
            doSomeSearch(bq);
          }
          

          ...are now going to be seeing REUSED_FILTER get cloned for every query. Again, at a minimum, this needs to be heavily spelled out in the MIGRATE and other docs, but personally i think it's going to far and we shouldn't be making these defensive clones.

          Show
          Hoss Man added a comment - I'm confused about the status of this issue. it's currently marked "Resolved" in 6.0, and trunk has the new API with all of the formerly public constructors for BooleanQuery completely deleted – but i don't see any commits/patches/mentions to backporting the builder & marking these constructors deprecated in the 5x branch. (let alone any mention in MIGRATE.txt) I'm also not sure how i feel about the defensive cloning of (sub) queries here. It's a significant enough difference from how constructing BooleanQueries worked in the past that at an absolute minimum the javadocs for the builder should be crystal clear that a clone is happening under the covers – but it also means that users who have existing code like this... public final static REUSED_FILTER = buildReallyMassiveQueryForReuse(); ... loop { BooleanQuery bq = new BooleanQuery(); bq.add(REUSED_FILTER, MUST); bq.add(buildQueryFromInput, MUST); ... doSomeSearch(bq); } ...are now going to be seeing REUSED_FILTER get cloned for every query. Again, at a minimum, this needs to be heavily spelled out in the MIGRATE and other docs, but personally i think it's going to far and we shouldn't be making these defensive clones.
          Hide
          Adrien Grand added a comment -

          Agreed that MIGRATE.txt needs more information about the API changes to PhraseQuery and BooleanQuery, I'll work on it.

          Are you suggesting that this new builder should be backported to 5.x and that the setters on BooleanQuery should be marked as deprecated? I did not consider this option because it would make BooleanQuery have the API of an immutable query while nothing would guarantee that it cannot be modified since we would need to keep the old deprecated setters. So I just made it a breaking change for 6.0.

          I am willing to document the fact that we clone sub queries, but I am on the fence about removing it, since without defensively cloning, the BooleanQuery would still be mutable while this issue is about making sure it cannot change. For instance, consider the following sequence of operations:

          TermQuery tq1 = new TermQuery(new Term("field", "value"));
          BooleanQuery bq1 = new BooleanQuery.Builder()
              .add(tq1, Occur.MUST)
              .build();
          
          TermQuery tq2 = new TermQuery(new Term("field", "value"));
          BooleanQuery bq2 = new BooleanQuery.Builder()
              .add(tq2, Occur.MUST)
              .build();
          
          assertEquals(bq1, bq2); // passes
          tq1.setBoost(2f);
          assertEquals(bq1, bq2); // fails if we did not clone because the boost of the sub query changed
          

          One motivation behind this change is to be able to enable the query cache by default in IndexSearcher (currently off), which we can only do if queries can reliably be used as cache keys.

          The cloning of the sub queries is also important for LUCENE-6305. It fixes BooleanQuery to ignore clause order by putting queries into a multiset, but if boosts of sub queries can change after the boolean query has been built then again the fix would not be applicable.

          Show
          Adrien Grand added a comment - Agreed that MIGRATE.txt needs more information about the API changes to PhraseQuery and BooleanQuery, I'll work on it. Are you suggesting that this new builder should be backported to 5.x and that the setters on BooleanQuery should be marked as deprecated? I did not consider this option because it would make BooleanQuery have the API of an immutable query while nothing would guarantee that it cannot be modified since we would need to keep the old deprecated setters. So I just made it a breaking change for 6.0. I am willing to document the fact that we clone sub queries, but I am on the fence about removing it, since without defensively cloning, the BooleanQuery would still be mutable while this issue is about making sure it cannot change. For instance, consider the following sequence of operations: TermQuery tq1 = new TermQuery( new Term( "field" , "value" )); BooleanQuery bq1 = new BooleanQuery.Builder() .add(tq1, Occur.MUST) .build(); TermQuery tq2 = new TermQuery( new Term( "field" , "value" )); BooleanQuery bq2 = new BooleanQuery.Builder() .add(tq2, Occur.MUST) .build(); assertEquals(bq1, bq2); // passes tq1.setBoost(2f); assertEquals(bq1, bq2); // fails if we did not clone because the boost of the sub query changed One motivation behind this change is to be able to enable the query cache by default in IndexSearcher (currently off), which we can only do if queries can reliably be used as cache keys. The cloning of the sub queries is also important for LUCENE-6305 . It fixes BooleanQuery to ignore clause order by putting queries into a multiset, but if boosts of sub queries can change after the boolean query has been built then again the fix would not be applicable.
          Hide
          Hoss Man added a comment -

          Are you suggesting that this new builder should be backported to 5.x and that the setters on BooleanQuery should be marked as deprecated?

          yes.

          I did not consider this option because it would make BooleanQuery have the API of an immutable query while nothing would guarantee that it cannot be modified since we would need to keep the old deprecated setters. So I just made it a breaking change for 6.0.

          I think it would be good to get the API out there for 5.x users, so they can start using it. Even if the underlying BooleanQuery objects themselves are not immutable, encouraging people to use the Builder pattern to create them, and discouraging them from expecting to be able to mutate the BooleanQuery objects seems like a good idea to get out there as early as possible.

          I am willing to document the fact that we clone sub queries, but I am on the fence about removing it, since without defensively cloning, the BooleanQuery would still be mutable while this issue is about making sure it cannot change.

          There's a difference between saying the BooleanQuery is immutable and cannot change and saying the queries wrapped by the BooleanQuery are cloned & no longer accessible and cannot cahnge – that's something that wasn't clear from your issue description until I looked closer at your commit.

          By comparison, "Collections.singletonList(T o)" is documented to return an immutable list, but it doesn't clone 'o' – it doesn't try to prevent you from calling o.changeState() after you construct that list.

          For instance, consider the following sequence of operations: ...

          The change in run time behavior of code like that is exactly why this change scares me – it's very similar to what users may have come to expect from code like my example if they want to do something like call REUSED_FILTER.setBost(foo) to dynamically tweak relevancy of many queries (all at once) at runtime.

          One motivation behind this change is to be able to enable the query cache by default in IndexSearcher (currently off), which we can only do if queries can reliably be used as cache keys.

          I can appreciate that goal, but i don't think it's ever going to be feasible to turn that on by default in the truly generic case of any arbitrary lucene application, where people might have custom Query impls.

          Bottom line: i think there is a decent risk that this under the covers, no way to turn off, cloning of sub-queries will have some serious negative consequences for some use cases, but will really only help if you use a query cache and if you get a high hit rate on that cache and if your code is written weirdly/broken to call setBoost on queries after you use them – which doesn't make sense to do at all if you are using a query cache.

          Consider again that REUSED_FILTER example i mentioned in my last comment – assuming the application is "well behaved", and doesn't call setBoost at add times: even w/o the implicit clone in BooleanQuery it should work great with a query cache enabled, and would use a lot less ram then with the implicit sub-query cloning in the BooleanQuery.Builder.

          But if an application does start trying to keep refrences to previously constructed Query instances, and call mutating methods (like setBoost) at runtime, then really they aren't going to be able to safely use the query cache at all – regardless of whether you have this implicit clone in BooleanQuery's builder.

          Show
          Hoss Man added a comment - Are you suggesting that this new builder should be backported to 5.x and that the setters on BooleanQuery should be marked as deprecated? yes. I did not consider this option because it would make BooleanQuery have the API of an immutable query while nothing would guarantee that it cannot be modified since we would need to keep the old deprecated setters. So I just made it a breaking change for 6.0. I think it would be good to get the API out there for 5.x users, so they can start using it. Even if the underlying BooleanQuery objects themselves are not immutable, encouraging people to use the Builder pattern to create them, and discouraging them from expecting to be able to mutate the BooleanQuery objects seems like a good idea to get out there as early as possible. I am willing to document the fact that we clone sub queries, but I am on the fence about removing it, since without defensively cloning, the BooleanQuery would still be mutable while this issue is about making sure it cannot change. There's a difference between saying the BooleanQuery is immutable and cannot change and saying the queries wrapped by the BooleanQuery are cloned & no longer accessible and cannot cahnge – that's something that wasn't clear from your issue description until I looked closer at your commit. By comparison, "Collections.singletonList(T o)" is documented to return an immutable list, but it doesn't clone 'o' – it doesn't try to prevent you from calling o.changeState() after you construct that list. For instance, consider the following sequence of operations: ... The change in run time behavior of code like that is exactly why this change scares me – it's very similar to what users may have come to expect from code like my example if they want to do something like call REUSED_FILTER.setBost(foo) to dynamically tweak relevancy of many queries (all at once) at runtime. One motivation behind this change is to be able to enable the query cache by default in IndexSearcher (currently off), which we can only do if queries can reliably be used as cache keys. I can appreciate that goal, but i don't think it's ever going to be feasible to turn that on by default in the truly generic case of any arbitrary lucene application, where people might have custom Query impls. Bottom line: i think there is a decent risk that this under the covers, no way to turn off, cloning of sub-queries will have some serious negative consequences for some use cases, but will really only help if you use a query cache and if you get a high hit rate on that cache and if your code is written weirdly/broken to call setBoost on queries after you use them – which doesn't make sense to do at all if you are using a query cache. Consider again that REUSED_FILTER example i mentioned in my last comment – assuming the application is "well behaved", and doesn't call setBoost at add times: even w/o the implicit clone in BooleanQuery it should work great with a query cache enabled, and would use a lot less ram then with the implicit sub-query cloning in the BooleanQuery.Builder. But if an application does start trying to keep refrences to previously constructed Query instances, and call mutating methods (like setBoost) at runtime, then really they aren't going to be able to safely use the query cache at all – regardless of whether you have this implicit clone in BooleanQuery's builder.
          Hide
          Yonik Seeley added a comment -

          but personally i think it's going to far and we shouldn't be making these defensive clones.

          +1
          Thanks for highlighting that - I hadn't realized it. I agree with the rest of your reasoning as well (I actually just deleted a lot of this post since it was redundant with what you said in your followup which I just saw)

          Show
          Yonik Seeley added a comment - but personally i think it's going to far and we shouldn't be making these defensive clones. +1 Thanks for highlighting that - I hadn't realized it. I agree with the rest of your reasoning as well (I actually just deleted a lot of this post since it was redundant with what you said in your followup which I just saw)
          Hide
          Adrien Grand added a comment -

          I can appreciate that goal, but i don't think it's ever going to be feasible to turn that on by default in the truly generic case of any arbitrary lucene application, where people might have custom Query impls.

          What makes custom query impls different?

          Consider again that REUSED_FILTER example i mentioned in my last comment – assuming the application is "well behaved", and doesn't call setBoost at add times: even w/o the implicit clone in BooleanQuery it should work great with a query cache enabled, and would use a lot less ram then with the implicit sub-query cloning in the BooleanQuery.Builder.

          I don't think "would use a lot less ram" is accurate: clone() is shallow so the main data-structures would still be shared with the clone. For instance if you consider TermsQuery which is in my experience the bad guy that can sometimes make keys (queries) use more memory than the values (doc id sets), clone() does not clone "termData", so between storing a single TermsQuery and storing a TermsQuery and its clone, there are only 24 bytes of RAM of difference (I just tested). Since the query cache typically needs to only cache few queries to be efficient, this would mean the difference would only be about a few kb.

          But if an application does start trying to keep refrences to previously constructed Query instances, and call mutating methods (like setBoost) at runtime, then really they aren't going to be able to safely use the query cache at all – regardless of whether you have this implicit clone in BooleanQuery's builder.

          A longer-term plan, once all our queries are fixed, is to upgrade clone()'s documentation to say that it has to return an independant instance. So there are two options: either deep cloning or shallow cloning and be immutable. By the way, this is where this issue arises from: we wanted to avoid having to deep-clone queries to use them as cache keys (see LUCENE-6369).

          Show
          Adrien Grand added a comment - I can appreciate that goal, but i don't think it's ever going to be feasible to turn that on by default in the truly generic case of any arbitrary lucene application, where people might have custom Query impls. What makes custom query impls different? Consider again that REUSED_FILTER example i mentioned in my last comment – assuming the application is "well behaved", and doesn't call setBoost at add times: even w/o the implicit clone in BooleanQuery it should work great with a query cache enabled, and would use a lot less ram then with the implicit sub-query cloning in the BooleanQuery.Builder. I don't think "would use a lot less ram" is accurate: clone() is shallow so the main data-structures would still be shared with the clone. For instance if you consider TermsQuery which is in my experience the bad guy that can sometimes make keys (queries) use more memory than the values (doc id sets), clone() does not clone "termData", so between storing a single TermsQuery and storing a TermsQuery and its clone, there are only 24 bytes of RAM of difference (I just tested). Since the query cache typically needs to only cache few queries to be efficient, this would mean the difference would only be about a few kb. But if an application does start trying to keep refrences to previously constructed Query instances, and call mutating methods (like setBoost) at runtime, then really they aren't going to be able to safely use the query cache at all – regardless of whether you have this implicit clone in BooleanQuery's builder. A longer-term plan, once all our queries are fixed, is to upgrade clone()'s documentation to say that it has to return an independant instance. So there are two options: either deep cloning or shallow cloning and be immutable. By the way, this is where this issue arises from: we wanted to avoid having to deep-clone queries to use them as cache keys (see LUCENE-6369 ).
          Hide
          ASF subversion and git services added a comment -

          Commit 1686408 from Adrien Grand in branch 'dev/trunk'
          [ https://svn.apache.org/r1686408 ]

          LUCENE-6570: Remove controversial defensive cloning of queries when adding them to a BooleanQuery.

          Show
          ASF subversion and git services added a comment - Commit 1686408 from Adrien Grand in branch 'dev/trunk' [ https://svn.apache.org/r1686408 ] LUCENE-6570 : Remove controversial defensive cloning of queries when adding them to a BooleanQuery.
          Hide
          Adrien Grand added a comment -

          I removed the controversial query cloning. Reopening to think about how we should handle the boost factor then. Options include cloning as part of createWeight since the query we are really interested in is the one returned by Weight.getQuery(), removing "void setBoost(boost)" in favour of something like "Query withBoost(boost)" or even remove the boost parameter from our queries and handle it externally.

          Show
          Adrien Grand added a comment - I removed the controversial query cloning. Reopening to think about how we should handle the boost factor then. Options include cloning as part of createWeight since the query we are really interested in is the one returned by Weight.getQuery(), removing "void setBoost(boost)" in favour of something like "Query withBoost(boost)" or even remove the boost parameter from our queries and handle it externally.
          Hide
          Yonik Seeley added a comment -

          Maybe we could satisfy people who want different things by having BooleanQuery be an abstract class.
          Things like defensive clones could be part of the default builder, but expert users could avoid the overhead of extra data structures introduced in LUCENE-6305 for example. It would give a way for well-behaved code that generates predictable queries to get maximum performance and minimum memory usage.

          Show
          Yonik Seeley added a comment - Maybe we could satisfy people who want different things by having BooleanQuery be an abstract class. Things like defensive clones could be part of the default builder, but expert users could avoid the overhead of extra data structures introduced in LUCENE-6305 for example. It would give a way for well-behaved code that generates predictable queries to get maximum performance and minimum memory usage.
          Hide
          ASF subversion and git services added a comment -

          Commit 1686440 from Adrien Grand in branch 'dev/trunk'
          [ https://svn.apache.org/r1686440 ]

          LUCENE-6570: Add a MIGRATE entry.

          Show
          ASF subversion and git services added a comment - Commit 1686440 from Adrien Grand in branch 'dev/trunk' [ https://svn.apache.org/r1686440 ] LUCENE-6570 : Add a MIGRATE entry.
          Hide
          Adrien Grand added a comment -

          I opened LUCENE-6590 as a follow-up for the boosting issue.

          I am re-closing now: I know I did not backport to 5.x but the API does not feel stable to me yet so I would like to avoid going back-and-forth between different ideas in 5.x releases. I suggest that we reconsider backporting the new API once we are more happy with itI...

          Show
          Adrien Grand added a comment - I opened LUCENE-6590 as a follow-up for the boosting issue. I am re-closing now: I know I did not backport to 5.x but the API does not feel stable to me yet so I would like to avoid going back-and-forth between different ideas in 5.x releases. I suggest that we reconsider backporting the new API once we are more happy with itI...
          Hide
          ASF subversion and git services added a comment -

          Commit 1693189 from Adrien Grand in branch 'dev/trunk'
          [ https://svn.apache.org/r1693189 ]

          LUCENE-6570: Move changes entry as the change was backported to 5.3.

          Show
          ASF subversion and git services added a comment - Commit 1693189 from Adrien Grand in branch 'dev/trunk' [ https://svn.apache.org/r1693189 ] LUCENE-6570 : Move changes entry as the change was backported to 5.3.
          Hide
          ASF subversion and git services added a comment -

          Commit 1693190 from Adrien Grand in branch 'dev/branches/branch_5x'
          [ https://svn.apache.org/r1693190 ]

          LUCENE-6570: Make BooleanQuery immutable.

          Show
          ASF subversion and git services added a comment - Commit 1693190 from Adrien Grand in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1693190 ] LUCENE-6570 : Make BooleanQuery immutable.
          Hide
          Greg Huber added a comment - - edited

          Hello,

          I am extending AnalyzingInfixSuggester for use with my suggester where I change the query to be a AND rather than an OR in the finishQuery(..) method.

          ie

          /**

          • Subclass can override this to tweak the Query before searching.
            */
            protected Query finishQuery(Builder in, boolean allTermsRequired) {

          // Update contexts to be ANDs (MUST) rather than ORs (SHOULD)
          for (BooleanClause booleanClause : in.build().clauses()) {
          // Change the contexts to be MUST (will be the only BooleanQuery and others will be TermQuery)
          if (booleanClause.getQuery() instanceof BooleanQuery) {
          BooleanQuery bq = (BooleanQuery) booleanClause.getQuery();
          for (BooleanClause bc : bq)

          { bc.setOccur(BooleanClause.Occur.MUST); }

          // We are done
          break;
          }
          }

          return in.build();
          }

          It says that BooleanClause.setOccur(..) is depreciated and will be immutable in 6.0, how would I then be able to do this?

          Cheers Greg

          Show
          Greg Huber added a comment - - edited Hello, I am extending AnalyzingInfixSuggester for use with my suggester where I change the query to be a AND rather than an OR in the finishQuery(..) method. ie /** Subclass can override this to tweak the Query before searching. */ protected Query finishQuery(Builder in, boolean allTermsRequired) { // Update contexts to be ANDs (MUST) rather than ORs (SHOULD) for (BooleanClause booleanClause : in.build().clauses()) { // Change the contexts to be MUST (will be the only BooleanQuery and others will be TermQuery) if (booleanClause.getQuery() instanceof BooleanQuery) { BooleanQuery bq = (BooleanQuery) booleanClause.getQuery(); for (BooleanClause bc : bq) { bc.setOccur(BooleanClause.Occur.MUST); } // We are done break; } } return in.build(); } It says that BooleanClause.setOccur(..) is depreciated and will be immutable in 6.0, how would I then be able to do this? Cheers Greg
          Hide
          Uwe Schindler added a comment - - edited

          The boolean clauses have to be created with MUST from the beginning.

          Show
          Uwe Schindler added a comment - - edited The boolean clauses have to be created with MUST from the beginning.
          Hide
          Greg Huber added a comment -

          btw, I have multiple contexts so call

          AnalyzingInfixSuggester.suggester.lookup(term, contexts, nMax, true, true);

          which will then call AnalyzingInfixSuggester.toQuery(..) eventually, which adds the context with the BooleanClause.Occur.SHOULD. Its a private method so is there a way to override this?

          private BooleanQuery toQuery(Set<BytesRef> contextInfo) {
          if (contextInfo == null || contextInfo.isEmpty())

          { return null; }

          BooleanQuery.Builder contextFilter = new BooleanQuery.Builder();
          for (BytesRef context : contextInfo)

          { addContextToQuery(contextFilter, context, BooleanClause.Occur.SHOULD); }

          return contextFilter.build();
          }

          Show
          Greg Huber added a comment - btw, I have multiple contexts so call AnalyzingInfixSuggester.suggester.lookup(term, contexts, nMax, true, true); which will then call AnalyzingInfixSuggester.toQuery(..) eventually, which adds the context with the BooleanClause.Occur.SHOULD. Its a private method so is there a way to override this? private BooleanQuery toQuery(Set<BytesRef> contextInfo) { if (contextInfo == null || contextInfo.isEmpty()) { return null; } BooleanQuery.Builder contextFilter = new BooleanQuery.Builder(); for (BytesRef context : contextInfo) { addContextToQuery(contextFilter, context, BooleanClause.Occur.SHOULD); } return contextFilter.build(); }
          Hide
          Adrien Grand added a comment -

          Maybe you can override `addContextToQuery` to override the clause?

          Show
          Adrien Grand added a comment - Maybe you can override `addContextToQuery` to override the clause?
          Hide
          Greg Huber added a comment -

          I guess it will work as its public, not much else I can do!

          Cheers Greg.

          Show
          Greg Huber added a comment - I guess it will work as its public, not much else I can do! Cheers Greg.
          Hide
          Shalin Shekhar Mangar added a comment -

          Bulk close for 5.3.0 release

          Show
          Shalin Shekhar Mangar added a comment - Bulk close for 5.3.0 release

            People

            • Assignee:
              Adrien Grand
              Reporter:
              Adrien Grand
            • Votes:
              0 Vote for this issue
              Watchers:
              7 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development