Details

    • Type: Improvement Improvement
    • Status: Open
    • Priority: Minor Minor
    • Resolution: Unresolved
    • Affects Version/s: 4.0-ALPHA
    • Fix Version/s: 4.9, 5.0
    • Component/s: core/search
    • Labels:
      None
    • Lucene Fields:
      New, Patch Available

      Description

      Out of the discussion in LUCENE-2868, it could be useful to add a generic Query Visitor / Walker that could be used for more advanced rewriting, optimizations or anything that requires state to be stored as each Query is visited.

      We could keep the interface very simple:

      public interface QueryVisitor {
        Query visit(Query query);
      }
      

      and then use a reflection based visitor like Earwin suggested, which would allow implementators to provide visit methods for just Querys that they are interested in.

      1. LUCENE-3041.patch
        15 kB
        Chris Male
      2. LUCENE-3041.patch
        48 kB
        Chris Male
      3. LUCENE-3041.patch
        51 kB
        Chris Male
      4. LUCENE-3041.patch
        14 kB
        Chris Male
      5. LUCENE-3041.patch
        14 kB
        Chris Male

        Activity

        Hide
        Uwe Schindler added a comment -

        Move issue to Lucene 4.9.

        Show
        Uwe Schindler added a comment - Move issue to Lucene 4.9.
        Hide
        Uwe Schindler added a comment -

        Hi I found a bug in the InvocationDispatcher when used with Java 7: the manual ordering of methods to visit is broken, because Class#getDeclaredMethods() returns the methods in random order (on same JVM, too). The ordering is not enough to prevent choosing the wrong method, so AmbiguousMethodException occurs ("Multiple methods resolved for parameter type, cannot disambiguate"). In the current code, because older JVMs return methods in declaration order and as Collections.sort() is stable, the methods declared earlier take precendence. But this does no longer work with Java 7.

        If I have time, I will upload a fix I found for that (I have one locally in another package, because this visitor was used for some internal code). The new approach groups the visit methods by parameter types with same superclass and also sorts by class hierarchy distance (more far superclasses get lower score than direct superclasses). The approch uses the lowest distance from parameter type to method type.

        This makes InvocationDispatcher randomization proof (just add Collections.sort() in the InvocationDispatcher ctor to test...)

        Show
        Uwe Schindler added a comment - Hi I found a bug in the InvocationDispatcher when used with Java 7: the manual ordering of methods to visit is broken, because Class#getDeclaredMethods() returns the methods in random order (on same JVM, too). The ordering is not enough to prevent choosing the wrong method, so AmbiguousMethodException occurs ("Multiple methods resolved for parameter type, cannot disambiguate"). In the current code, because older JVMs return methods in declaration order and as Collections.sort() is stable, the methods declared earlier take precendence. But this does no longer work with Java 7. If I have time, I will upload a fix I found for that (I have one locally in another package, because this visitor was used for some internal code). The new approach groups the visit methods by parameter types with same superclass and also sorts by class hierarchy distance (more far superclasses get lower score than direct superclasses). The approch uses the lowest distance from parameter type to method type. This makes InvocationDispatcher randomization proof (just add Collections.sort() in the InvocationDispatcher ctor to test...)
        Hide
        Steve Rowe added a comment -

        Bulk move 4.4 issues to 4.5 and 5.0

        Show
        Steve Rowe added a comment - Bulk move 4.4 issues to 4.5 and 5.0
        Hide
        Simon Willnauer added a comment -

        Yes I definitely am going to revisit this. I'm still unsure how best to incorporate this into IndexSearcher so that it can work per-segment. But we can commit the functionality and then deal with the integration separately.

        actually this already works per segment in a way it should. If you look at rewriting this is done based on the IS anyway for a query so we only rewrite once. If you are inside a MTQ and you want to do the rewrite caching or other things you can do this based on the IS. The question is if we should really commit this without any specific usecase or other code using it :/

        are you going to address earwins issue still?

        Show
        Simon Willnauer added a comment - Yes I definitely am going to revisit this. I'm still unsure how best to incorporate this into IndexSearcher so that it can work per-segment. But we can commit the functionality and then deal with the integration separately. actually this already works per segment in a way it should. If you look at rewriting this is done based on the IS anyway for a query so we only rewrite once. If you are inside a MTQ and you want to do the rewrite caching or other things you can do this based on the IS. The question is if we should really commit this without any specific usecase or other code using it :/ are you going to address earwins issue still?
        Hide
        Chris Male added a comment -

        Simon,

        Yes I definitely am going to revisit this. I'm still unsure how best to incorporate this into IndexSearcher so that it can work per-segment. But we can commit the functionality and then deal with the integration separately.

        What plans do you have?

        Show
        Chris Male added a comment - Simon, Yes I definitely am going to revisit this. I'm still unsure how best to incorporate this into IndexSearcher so that it can work per-segment. But we can commit the functionality and then deal with the integration separately. What plans do you have?
        Hide
        Simon Willnauer added a comment -

        chris are you going to revisite this? I think we can make good use of this though.

        simon

        Show
        Simon Willnauer added a comment - chris are you going to revisite this? I think we can make good use of this though. simon
        Hide
        Earwin Burrfoot added a comment - - edited

        The static cache is now not threadsafe.
        And original had nice diagnostics for ambigous dispatches. Why not just take it and cut over to JDK reflection and CHM?
        Same can be said for tests.

        What about throwing original invocation exception instead of the wrapper? Since we're emulating a language feature, a simple method call, it's logical to only throw custom exceptions in .. well .. exceptional cases, like ambiguity/no matching method. If client code throws Errors/RuntimeExceptions, they should be transparently rethrown.

        Show
        Earwin Burrfoot added a comment - - edited The static cache is now not threadsafe. And original had nice diagnostics for ambigous dispatches. Why not just take it and cut over to JDK reflection and CHM? Same can be said for tests. What about throwing original invocation exception instead of the wrapper? Since we're emulating a language feature, a simple method call, it's logical to only throw custom exceptions in .. well .. exceptional cases, like ambiguity/no matching method. If client code throws Errors/RuntimeExceptions, they should be transparently rethrown.
        Hide
        Chris Male added a comment -

        Are you going to add the IS changes here too?

        Yup, I'm just working through the best way to expose the API in the IS while supporting per segment walking. I'll have something together in the next day or two.

        I wonder if we could move the MethodDispatchException into InvocationDispatcher as a nested class

        Good call. I'll make the change and upload something immediately.

        Show
        Chris Male added a comment - Are you going to add the IS changes here too? Yup, I'm just working through the best way to expose the API in the IS while supporting per segment walking. I'll have something together in the next day or two. I wonder if we could move the MethodDispatchException into InvocationDispatcher as a nested class Good call. I'll make the change and upload something immediately.
        Hide
        Simon Willnauer added a comment -

        New patch that implements what I said in the previous comments (except for the IS changes).

        Chris, patch looks good! Are you going to add the IS changes here too? I wonder if we could move the MethodDispatchException into InvocationDispatcher as a nested class I don't think we need an extra file for this class.

        Show
        Simon Willnauer added a comment - New patch that implements what I said in the previous comments (except for the IS changes). Chris, patch looks good! Are you going to add the IS changes here too? I wonder if we could move the MethodDispatchException into InvocationDispatcher as a nested class I don't think we need an extra file for this class.
        Hide
        Chris Male added a comment -

        New patch that implements what I said in the previous comments (except for the IS changes).

        Also a test is now included.

        Show
        Chris Male added a comment - New patch that implements what I said in the previous comments (except for the IS changes). Also a test is now included.
        Hide
        Chris Male added a comment -

        To follow up on Earwin's comments, I'm going to do the following:

        • Leave Query#rewrite out of the walking process. As Earwin said, rewrite provides vital query optimization / conversion to primitive runnable queries. Having this method on Query is a good idea since user Queries can simply implement this method and move on.
        • In a separate issue, add a RewriteState like concept which can be used for caching rewrites like that suggested by Simon. This will have a considerable performance improvement for people doing lots of repeated FuzzyQuerys for example.
        • Change my processing concept into a generic Walker<I, O> system, which can be used for lots of things in Lucene. Users can implement this Walker to do whatever they want (maybe we can pry Earwin's walker based highlighter from him? )
        • Overload IndexSearcher's methods to support passing in a Walker. We need this, instead of simply having the Walker external, because we really want to support per-segment Walking.

        I'll make a patch for the stuff related to this issue shortly, and spin off the RewriteState stuff.

        Show
        Chris Male added a comment - To follow up on Earwin's comments, I'm going to do the following: Leave Query#rewrite out of the walking process. As Earwin said, rewrite provides vital query optimization / conversion to primitive runnable queries. Having this method on Query is a good idea since user Queries can simply implement this method and move on. In a separate issue, add a RewriteState like concept which can be used for caching rewrites like that suggested by Simon. This will have a considerable performance improvement for people doing lots of repeated FuzzyQuerys for example. Change my processing concept into a generic Walker<I, O> system, which can be used for lots of things in Lucene. Users can implement this Walker to do whatever they want (maybe we can pry Earwin's walker based highlighter from him? ) Overload IndexSearcher's methods to support passing in a Walker. We need this, instead of simply having the Walker external, because we really want to support per-segment Walking. I'll make a patch for the stuff related to this issue shortly, and spin off the RewriteState stuff.
        Hide
        Earwin Burrfoot added a comment -

        I vehemently oppose introducing the "visitor design pattern" (classic double-dispatch version) into the Query API. It is a badly broken replacement (ie, cannot be easily extended) for multiple dispatch.

        Also, from the looks of it (short IRC discussion), user-written visitors and rewrite() API have totally different aims.

        • rewrite() is very specific (it is a pre-search preparation that produces runnable query, eg expands multi-term queries into OR sequences or wrapped filters), but should work over any kinds of user-written Queries with possibly exotic behaviours (eg, take rewrite from the cache). Consequently, the logic is tightly coupled to each Query-impl innards.
        • user-written visitors on the other hand, may have a multitude of purporses (wildly varying logic for node handling + navigation - eg, some may want to see MTQs expanded, and some may not) over relatively fixed number of possible node types.

        So the best possible solution so far is to keep rewrite() asis - it serves its purporse quite well.
        And introduce generic reflection-based multiple-dispatch visitor that can walk any kind of hierarchies (eg, in my project I rewrite ASTs to ASTs, ASTs to Queries, and Queries to bags of Terms) so people can transform their query trees.
        The current patch contains a derivative of my original version. And here's a test/example
        This visitor keeps all logic on itself and thus cannot replace rewrite().

        Show
        Earwin Burrfoot added a comment - I vehemently oppose introducing the "visitor design pattern" (classic double-dispatch version) into the Query API. It is a badly broken replacement (ie, cannot be easily extended) for multiple dispatch. Also, from the looks of it (short IRC discussion), user-written visitors and rewrite() API have totally different aims. rewrite() is very specific (it is a pre-search preparation that produces runnable query, eg expands multi-term queries into OR sequences or wrapped filters), but should work over any kinds of user-written Queries with possibly exotic behaviours (eg, take rewrite from the cache). Consequently, the logic is tightly coupled to each Query-impl innards. user-written visitors on the other hand, may have a multitude of purporses (wildly varying logic for node handling + navigation - eg, some may want to see MTQs expanded, and some may not) over relatively fixed number of possible node types. So the best possible solution so far is to keep rewrite() asis - it serves its purporse quite well. And introduce generic reflection-based multiple-dispatch visitor that can walk any kind of hierarchies (eg, in my project I rewrite ASTs to ASTs, ASTs to Queries, and Queries to bags of Terms) so people can transform their query trees. The current patch contains a derivative of my original version . And here's a test/example This visitor keeps all logic on itself and thus cannot replace rewrite().
        Hide
        David Smiley added a comment -

        Yes! I enthusiastically support introducing the visitor design pattern into the Query api. I've polled the community on this before and got positive responses from a few committers but I haven't yet had the time to do anything. It's great to see you've gotten the ball rolling Chris.

        I haven't looked at your patch yet. Query.rewrite() is definitely a candidate for reworking in terms of this new pattern.

        Show
        David Smiley added a comment - Yes! I enthusiastically support introducing the visitor design pattern into the Query api. I've polled the community on this before and got positive responses from a few committers but I haven't yet had the time to do anything. It's great to see you've gotten the ball rolling Chris. I haven't looked at your patch yet. Query.rewrite() is definitely a candidate for reworking in terms of this new pattern.
        Hide
        Chris Male added a comment -

        This is an excellent opportunity to redefine Queries as immutable

        What do you envisage this involving? Although not required by the API, most rewriting implementations make a new Query and add changes there, leaving themselves untouched. Are you wanting to require this somehow?

        Show
        Chris Male added a comment - This is an excellent opportunity to redefine Queries as immutable What do you envisage this involving? Although not required by the API, most rewriting implementations make a new Query and add changes there, leaving themselves untouched. Are you wanting to require this somehow?
        Hide
        Lance Norskog added a comment -

        This is an excellent opportunity to redefine Queries as immutable, which would make query rewriting an order of magnitude safer.

        Show
        Lance Norskog added a comment - This is an excellent opportunity to redefine Queries as immutable, which would make query rewriting an order of magnitude safer.
        Hide
        Chris Male added a comment -

        Updated patch which removes the stupid test I'd included

        Show
        Chris Male added a comment - Updated patch which removes the stupid test I'd included
        Hide
        Chris Male added a comment -

        A much larger patch that implements full query AST walking.

        The problem with having the QueryProcessor fully external to Query#rewrite, is that composite Querys would need to expose their children. This is a little messy and could be hard with more exotic user-made Querys.

        So this patch basically expands Query#rewrite to include the QueryProcessor. Composite queries can then pass their children to the processor during their rewrite.

        For backwards compat, and simplicity, I've created a SimpleQueryProcessor which directly calls rewrite. This means casual users do not need to concern themselves with processing.

        Overtime we can expose the QueryProcessor API through IndexSearcher and other situations.

        Show
        Chris Male added a comment - A much larger patch that implements full query AST walking. The problem with having the QueryProcessor fully external to Query#rewrite, is that composite Querys would need to expose their children. This is a little messy and could be hard with more exotic user-made Querys. So this patch basically expands Query#rewrite to include the QueryProcessor. Composite queries can then pass their children to the processor during their rewrite. For backwards compat, and simplicity, I've created a SimpleQueryProcessor which directly calls rewrite. This means casual users do not need to concern themselves with processing. Overtime we can expose the QueryProcessor API through IndexSearcher and other situations.
        Hide
        Chris Male added a comment -

        No, you didn't miss something. The RewriteCachingQueryProcessor currently only rewrites the top level query. It needs to be extended to handle BooleanQuerys and any other composite query (BoostingQuery for example). I might actually add a DefaultQueryProcessor again which walks the full Query AST by default. Then get RewritingCachingQueryProcessor to extend and cache.

        I'll iterate a new patch.

        Show
        Chris Male added a comment - No, you didn't miss something. The RewriteCachingQueryProcessor currently only rewrites the top level query. It needs to be extended to handle BooleanQuerys and any other composite query (BoostingQuery for example). I might actually add a DefaultQueryProcessor again which walks the full Query AST by default. Then get RewritingCachingQueryProcessor to extend and cache. I'll iterate a new patch.
        Hide
        Simon Willnauer added a comment -

        Chris, nice simplification. I have one question, lets say we have a boolean query OR(AND(Fuzzy:A, Fuzzy:B), AND(Fuzzy A, Fuzzy:C)) how would it be possible with the current patch to reuse the rewrite for Fuzzy:A? As far as I can see If I don't rewrite the boolean query myself the current patch will rewrite the top level query and returns right? So somehow it must be possible to walk down the query ast.

        or do I miss something?

        Show
        Simon Willnauer added a comment - Chris, nice simplification. I have one question, lets say we have a boolean query OR(AND(Fuzzy:A, Fuzzy:B), AND(Fuzzy A, Fuzzy:C)) how would it be possible with the current patch to reuse the rewrite for Fuzzy:A? As far as I can see If I don't rewrite the boolean query myself the current patch will rewrite the top level query and returns right? So somehow it must be possible to walk down the query ast. or do I miss something?
        Hide
        Chris Male added a comment -

        Updated patch.

        This simplifies the hierarchy a lot. DispatchingQueryProcessor is merged into QueryProcessor, which then becomes an abstract class. QueryProcessor now has #dispatchProcessing(Query) which is the entry point to the dispatching process.

        DefaultQueryProcessor is changed to RewriteCachingQueryProcessor which caches the rewriting of querys. This could be extended further to provide special support for BooleanQuery.

        Remaining to do is to provide a test which illustrates walking through a complex class.

        Show
        Chris Male added a comment - Updated patch. This simplifies the hierarchy a lot. DispatchingQueryProcessor is merged into QueryProcessor, which then becomes an abstract class. QueryProcessor now has #dispatchProcessing(Query) which is the entry point to the dispatching process. DefaultQueryProcessor is changed to RewriteCachingQueryProcessor which caches the rewriting of querys. This could be extended further to provide special support for BooleanQuery. Remaining to do is to provide a test which illustrates walking through a complex class.
        Hide
        Simon Willnauer added a comment -

        I don't quite follow you. Currently DispatchingQueryProcessor caches InvocationDispatchers by concrete impl type. So we only create a new InvocationDispatcher when we have a new implementation (which means InvocationDispatchers are shared between segments, searches, everything). In that regard DispatchingQueryProcessor#dispatcherByClass should be a ConcurrentHashMap. But otherwise, I think we're okay?

        my bad.. I didn't look close enough Yet, I was proposing something like what you did though the problem here could be that it is static but for now ConcurrentHashMap would do.

        Any thoughts on how to avoid that?

        hmm, I think we should try to dispatch first. If there is not specialized method to dispatch we should rewrite and continue.

        I still wonder how this would walk down the query tree so I am happily waiting for the next patch.

        Show
        Simon Willnauer added a comment - I don't quite follow you. Currently DispatchingQueryProcessor caches InvocationDispatchers by concrete impl type. So we only create a new InvocationDispatcher when we have a new implementation (which means InvocationDispatchers are shared between segments, searches, everything). In that regard DispatchingQueryProcessor#dispatcherByClass should be a ConcurrentHashMap. But otherwise, I think we're okay? my bad.. I didn't look close enough Yet, I was proposing something like what you did though the problem here could be that it is static but for now ConcurrentHashMap would do. Any thoughts on how to avoid that? hmm, I think we should try to dispatch first. If there is not specialized method to dispatch we should rewrite and continue. I still wonder how this would walk down the query tree so I am happily waiting for the next patch.
        Hide
        Chris Male added a comment -

        Regarding sharing the map, I think you should use a prototype pattern that creates a new Processor from an existing one maybe via clone()? In the InvocationDispatcher case we should maybe use a concurrent hash map and share the map across instances for the same dispatcher class.

        I don't quite follow you. Currently DispatchingQueryProcessor caches InvocationDispatchers by concrete impl type. So we only create a new InvocationDispatcher when we have a new implementation (which means InvocationDispatchers are shared between segments, searches, everything). In that regard DispatchingQueryProcessor#dispatcherByClass should be a ConcurrentHashMap. But otherwise, I think we're okay?

        The process implementation in DefaultQueryProcessor executes query.rewrite before passing the query to the dispatcher which is no good since some QueryProcessor impls might not want to rewrite that query at all. In LUCENE-2868 karl tries to find a way to prevent lucene to rewrite one and the same FuzzyQuery since he has them in multiple clauses somewhere down the BQ tree. This is a super expensive operation in his case to rewriting it only once makes sense. I think this should be left to the actual implementation.

        This is super tricky. The question is how to define a base case in #process(Query). Lets assume DefaultQueryProcessor#process(Query) just dispatched immediately. It might be a receiver of the same dispatch (perhaps the query is a TermQuery and no #process(TermQuery) is provided, so #process(Query) is chosen). It then just dispatches again, receives again.. and we're in a loop.

        Any thoughts on how to avoid that?

        Show
        Chris Male added a comment - Regarding sharing the map, I think you should use a prototype pattern that creates a new Processor from an existing one maybe via clone()? In the InvocationDispatcher case we should maybe use a concurrent hash map and share the map across instances for the same dispatcher class. I don't quite follow you. Currently DispatchingQueryProcessor caches InvocationDispatchers by concrete impl type. So we only create a new InvocationDispatcher when we have a new implementation (which means InvocationDispatchers are shared between segments, searches, everything). In that regard DispatchingQueryProcessor#dispatcherByClass should be a ConcurrentHashMap. But otherwise, I think we're okay? The process implementation in DefaultQueryProcessor executes query.rewrite before passing the query to the dispatcher which is no good since some QueryProcessor impls might not want to rewrite that query at all. In LUCENE-2868 karl tries to find a way to prevent lucene to rewrite one and the same FuzzyQuery since he has them in multiple clauses somewhere down the BQ tree. This is a super expensive operation in his case to rewriting it only once makes sense. I think this should be left to the actual implementation. This is super tricky. The question is how to define a base case in #process(Query). Lets assume DefaultQueryProcessor#process(Query) just dispatched immediately. It might be a receiver of the same dispatch (perhaps the query is a TermQuery and no #process(TermQuery) is provided, so #process(Query) is chosen). It then just dispatches again, receives again.. and we're in a loop. Any thoughts on how to avoid that?
        Hide
        Simon Willnauer added a comment -

        Hey chris,

        I have some comments on your patch:

        • s/visitorClass/processorClass/
        • s/visitor/processor/
        • in InvocationDispatcher you might wanna check if the single parameter is Query subclass
        • I am worried about checking which method to dispatch for every query type once per segment as well as that we create an processor instance per segment and not per search. There are actually two problems here. 1. We create a new instance per segment. 2. We can not share the dispatch map, not even across segments. I think we should create one instance per search and pass the IR down together with the query or even better follow the pattern that Collector et al. uses and pass it with a setReader(IR) method. that way we also have a clear way how to tell the processor that we just moved on to a new segment. Regarding sharing the map, I think you should use a prototype pattern that creates a new Processor from an existing one maybe via clone()? In the InvocationDispatcher case we should maybe use a concurrent hash map and share the map across instances for the same dispatcher class.
        • The process implementation in DefaultQueryProcessor executes query.rewrite before passing the query to the
          dispatcher which is no good since some QueryProcessor impls might not want to rewrite that query at all. In LUCENE-2868 karl tries to find a way to prevent lucene to rewrite one and the same FuzzyQuery since he has them in multiple clauses somewhere down the BQ tree. This is a super expensive operation in his case to rewriting it only once makes sense. I think this should be left to the actual implementation.
        Show
        Simon Willnauer added a comment - Hey chris, I have some comments on your patch: s/visitorClass/processorClass/ s/visitor/processor/ in InvocationDispatcher you might wanna check if the single parameter is Query subclass I am worried about checking which method to dispatch for every query type once per segment as well as that we create an processor instance per segment and not per search. There are actually two problems here. 1. We create a new instance per segment. 2. We can not share the dispatch map, not even across segments. I think we should create one instance per search and pass the IR down together with the query or even better follow the pattern that Collector et al. uses and pass it with a setReader(IR) method. that way we also have a clear way how to tell the processor that we just moved on to a new segment. Regarding sharing the map, I think you should use a prototype pattern that creates a new Processor from an existing one maybe via clone()? In the InvocationDispatcher case we should maybe use a concurrent hash map and share the map across instances for the same dispatcher class. The process implementation in DefaultQueryProcessor executes query.rewrite before passing the query to the dispatcher which is no good since some QueryProcessor impls might not want to rewrite that query at all. In LUCENE-2868 karl tries to find a way to prevent lucene to rewrite one and the same FuzzyQuery since he has them in multiple clauses somewhere down the BQ tree. This is a super expensive operation in his case to rewriting it only once makes sense. I think this should be left to the actual implementation.
        Hide
        Chris Male added a comment -

        Working patch, minus tests at the moment.

        Introduces the QueryProcessor interface like that described above.

        Also introduces a generic InvocationDispatcher that can be used for visitor like multi-dispatch.

        Show
        Chris Male added a comment - Working patch, minus tests at the moment. Introduces the QueryProcessor interface like that described above. Also introduces a generic InvocationDispatcher that can be used for visitor like multi-dispatch.
        Hide
        Chris Male added a comment -

        Yup I have a patch cooking.

        Stuff like automatic dispatch for certain query types might need some cglib magic or at least req. java 6 to perform so they might need to go to contrib/misc.

        I don't think this will be the case. I am striving to use Java 5 reflection classes and that seems to be working fine.

        Show
        Chris Male added a comment - Yup I have a patch cooking. Stuff like automatic dispatch for certain query types might need some cglib magic or at least req. java 6 to perform so they might need to go to contrib/misc. I don't think this will be the case. I am striving to use Java 5 reflection classes and that seems to be working fine.
        Hide
        Simon Willnauer added a comment -

        I'm happy to settle with QueryProcessor#process

        +1 - Chris are you cranking out a patch for this?

        I think if we have a QueryProcessor we should somehow make it possible to optionally hook it into IndexSearcher to essentially replace the direct call to Query#rewrite
        Eventually it should be the QueryProcessor's responsibility to rewrite the query and pass the actual 'primitive' query to the searcher once done. I think its good to keep that interface super lean and let more fancy impl. follow up on it. Stuff like automatic dispatch for certain query types might need some cglib magic or at least req. java 6 to perform so they might need to go to contrib/misc.

        Show
        Simon Willnauer added a comment - I'm happy to settle with QueryProcessor#process +1 - Chris are you cranking out a patch for this? I think if we have a QueryProcessor we should somehow make it possible to optionally hook it into IndexSearcher to essentially replace the direct call to Query#rewrite Eventually it should be the QueryProcessor's responsibility to rewrite the query and pass the actual 'primitive' query to the searcher once done. I think its good to keep that interface super lean and let more fancy impl. follow up on it. Stuff like automatic dispatch for certain query types might need some cglib magic or at least req. java 6 to perform so they might need to go to contrib/misc.
        Hide
        Chris Male added a comment -

        I'm happy to settle with QueryProcessor#process

        Show
        Chris Male added a comment - I'm happy to settle with QueryProcessor#process
        Hide
        Simon Willnauer added a comment -

        What about QueryOptimizer?

        QueryProcessor or QueryPreProcessor?

        Show
        Simon Willnauer added a comment - What about QueryOptimizer? QueryProcessor or QueryPreProcessor?
        Hide
        Chris Male added a comment -

        I remain weary of calling it QueryRewriter since there is already Query rewriting support through Query#rewrite, but I take your point. What about QueryOptimizer?

        Show
        Chris Male added a comment - I remain weary of calling it QueryRewriter since there is already Query rewriting support through Query#rewrite, but I take your point. What about QueryOptimizer?
        Hide
        Simon Willnauer added a comment -

        I like the simple interface but the name is somewhat misleading here I think. Either we make this a 'real' visitor pattern and add accept methods to Query which I don't think is necessary or we should make the name specific for the task. Since this is really for walking the Query 'AST' during the rewrite process we should make this very clean in the IF name. QueryRewriter or something like that would make more sense and it would justify the Query return value, no?

        Show
        Simon Willnauer added a comment - I like the simple interface but the name is somewhat misleading here I think. Either we make this a 'real' visitor pattern and add accept methods to Query which I don't think is necessary or we should make the name specific for the task. Since this is really for walking the Query 'AST' during the rewrite process we should make this very clean in the IF name. QueryRewriter or something like that would make more sense and it would justify the Query return value, no?

          People

          • Assignee:
            Simon Willnauer
            Reporter:
            Chris Male
          • Votes:
            2 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

            • Created:
              Updated:

              Development