Uploaded image for project: 'Lucene - Core'
  1. Lucene - Core
  2. LUCENE-1061

Adding a factory to QueryParser to instantiate query instances

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.3
    • Fix Version/s: 2.4
    • Component/s: core/queryparser
    • Labels:
      None
    • Lucene Fields:
      New, Patch Available

      Description

      With the new efforts with Payload and scoring functions, it would be nice to plugin custom query implementations while using the same QueryParser.
      Included is a patch with some refactoring the QueryParser to take a factory that produces query instances.

      1. lucene_patch.txt
        6 kB
        John Wang
      2. LUCENE-1061.patch
        16 kB
        Michael McCandless
      3. LUCENE-1061.patch
        16 kB
        Michael McCandless

        Issue Links

          Activity

          Hide
          john.wang@gmail.com John Wang added a comment -

          This patch introduces a new file: QueryBuilder which is just a factory for instantiating query objects.

          And the class QueryParser is modified to use use the factory to build the final query.

          This is backward compatible.

          Show
          john.wang@gmail.com John Wang added a comment - This patch introduces a new file: QueryBuilder which is just a factory for instantiating query objects. And the class QueryParser is modified to use use the factory to build the final query. This is backward compatible.
          Hide
          michaelbusch Michael Busch added a comment -

          Yonik,

          I remember that we talked briefly about a QueryFactory in Atlanta and
          you had some cool ideas. Maybe you could mention them here?

          Show
          michaelbusch Michael Busch added a comment - Yonik, I remember that we talked briefly about a QueryFactory in Atlanta and you had some cool ideas. Maybe you could mention them here?
          Hide
          mikemccand Michael McCandless added a comment -

          This looks useful – Michael or Yonik, what do you think?

          Show
          mikemccand Michael McCandless added a comment - This looks useful – Michael or Yonik, what do you think?
          Hide
          ehatcher Erik Hatcher added a comment -

          What's wrong with just subclassing QueryParser and overriding the desired methods? Either way someone wanting to provide custom Query implementations will be writing effectively the same code, just with more indirection with this method.

          Show
          ehatcher Erik Hatcher added a comment - What's wrong with just subclassing QueryParser and overriding the desired methods? Either way someone wanting to provide custom Query implementations will be writing effectively the same code, just with more indirection with this method.
          Hide
          mikemccand Michael McCandless added a comment -

          What's wrong with just subclassing QueryParser and overriding the desired methods?

          I think subclassing would work fine too, as long as we fix QueryParser so that instead of doing things like:

            TermQuery q = new TermQuery(...);
          

          it would do:

            q = newTermQuery(...);
          

          and then define a newTermQuery(...) method which a subclass could then override. Is that what you have in mind Erik? John does this sound OK?

          Show
          mikemccand Michael McCandless added a comment - What's wrong with just subclassing QueryParser and overriding the desired methods? I think subclassing would work fine too, as long as we fix QueryParser so that instead of doing things like: TermQuery q = new TermQuery(...); it would do: q = newTermQuery(...); and then define a newTermQuery(...) method which a subclass could then override. Is that what you have in mind Erik? John does this sound OK?
          Hide
          creamyg Marvin Humphrey added a comment -

          > What's wrong with just subclassing QueryParser and overriding the desired methods?

          For what it's worth, I find the idea of a separate class appealing from an API documentation design standpoint. This usage of QueryParser is relatively arcane. Hiding all these methods away in a separate class means less clutter and less effort for most people scanning through the API docs of an important, commonly used class.

          Show
          creamyg Marvin Humphrey added a comment - > What's wrong with just subclassing QueryParser and overriding the desired methods? For what it's worth, I find the idea of a separate class appealing from an API documentation design standpoint. This usage of QueryParser is relatively arcane. Hiding all these methods away in a separate class means less clutter and less effort for most people scanning through the API docs of an important, commonly used class.
          Hide
          shaie Shai Erera added a comment -

          I think the idea of sub-classing makes more sense when you're talking about a QueryParser. The parser includes, mainly, two logics - parsing a query into query "segments" or "nodes" and then creating the appropriate Lucene Query instance for each sengment/node.
          By modifying QueryParser so that it uses protected callback methods for creating the appropriate Query instance for different node types, you get a very clean and elegant query parsing solution.
          All sub-classes can share the same parsing rules (which generally don't change), and just instantiating their own TermQuery or PrefixQuery instances.

          Wouldn't you need to do that anyway with the Query factory? I mean, what if I want to create all the query types like the default implementation, but change the PrefixQuery instance to something I wrote? Wouldn't I need to extend the default query factory? Sounds exactly like extending QueryParser to me.

          I recently implemented a query parser for a different query syntax than Lucene's, but I still wanted to create Lucene Query objects. The parser includes different protected callback methods like getTermQuery, getPhraseQuery, getPrefixQuery etc, providing a default implementation to each of course. It parses the query into "nodes" and then invoke the appropriate callback method per node (i.e., for a phrase node, it invokes getPhraseQuery).
          We have several sub-classes which extend this parser to provide their own implementation for various node types (for example, this approach allows you to remove any "prefix" or "wildcard" nodes from the query, without touching the parser).

          Show
          shaie Shai Erera added a comment - I think the idea of sub-classing makes more sense when you're talking about a QueryParser. The parser includes, mainly, two logics - parsing a query into query "segments" or "nodes" and then creating the appropriate Lucene Query instance for each sengment/node. By modifying QueryParser so that it uses protected callback methods for creating the appropriate Query instance for different node types, you get a very clean and elegant query parsing solution. All sub-classes can share the same parsing rules (which generally don't change), and just instantiating their own TermQuery or PrefixQuery instances. Wouldn't you need to do that anyway with the Query factory? I mean, what if I want to create all the query types like the default implementation, but change the PrefixQuery instance to something I wrote? Wouldn't I need to extend the default query factory? Sounds exactly like extending QueryParser to me. I recently implemented a query parser for a different query syntax than Lucene's, but I still wanted to create Lucene Query objects. The parser includes different protected callback methods like getTermQuery, getPhraseQuery, getPrefixQuery etc, providing a default implementation to each of course. It parses the query into "nodes" and then invoke the appropriate callback method per node (i.e., for a phrase node, it invokes getPhraseQuery). We have several sub-classes which extend this parser to provide their own implementation for various node types (for example, this approach allows you to remove any "prefix" or "wildcard" nodes from the query, without touching the parser).
          Hide
          yseeley@gmail.com Yonik Seeley added a comment -

          In this specific instance I think subclassing of QueryParser is the right way to go, and avoids introducing another class. People are going to want more customizations than just changing the types of created subclauses, and will need to subclass for these other types of changes anyway.

          Show
          yseeley@gmail.com Yonik Seeley added a comment - In this specific instance I think subclassing of QueryParser is the right way to go, and avoids introducing another class. People are going to want more customizations than just changing the types of created subclauses, and will need to subclass for these other types of changes anyway.
          Hide
          mikemccand Michael McCandless added a comment -

          OK then let's take the subclassing approach; I'll rework the patch.

          Show
          mikemccand Michael McCandless added a comment - OK then let's take the subclassing approach; I'll rework the patch.
          Hide
          mikemccand Michael McCandless added a comment -

          OK reworked patch to subclass. I plan to commit in a day or two.

          Show
          mikemccand Michael McCandless added a comment - OK reworked patch to subclass. I plan to commit in a day or two.
          Hide
          ehatcher Erik Hatcher added a comment -

          Michael - you are a machine!

          +1 to the subclassing approach and your general patch.

          What might be even more interesting is to make the newXXX methods return Query instead of a specific type. I'm not sure if that would work in all cases (surely not for BooleanQuery), but might for most of 'em.

          For example, what if newTermQuery(Term term) returned a Query instead of a TermQuery? That'd add a fair bit more flexibility, as long as none of the calling code needed a specific type of Query.

          The hoops we jump through because we're in Java.... sheesh.

          Show
          ehatcher Erik Hatcher added a comment - Michael - you are a machine! +1 to the subclassing approach and your general patch. What might be even more interesting is to make the newXXX methods return Query instead of a specific type. I'm not sure if that would work in all cases (surely not for BooleanQuery), but might for most of 'em. For example, what if newTermQuery(Term term) returned a Query instead of a TermQuery? That'd add a fair bit more flexibility, as long as none of the calling code needed a specific type of Query. The hoops we jump through because we're in Java.... sheesh.
          Hide
          mikemccand Michael McCandless added a comment -

          The hoops we jump through because we're in Java.... sheesh.

          Alas, yes... maybe we should all switch to Erlang!

          What might be even more interesting is to make the newXXX methods return Query instead of a specific type. I'm not sure if that would work in all cases (surely not for BooleanQuery), but might for most of 'em.

          OK I was able to do this with many of them – attached new patch. Only PhraseQuery, MultiPhraseQuery, BooleanQuery/Clause required strong typing. All others now return Query.

          Show
          mikemccand Michael McCandless added a comment - The hoops we jump through because we're in Java.... sheesh. Alas, yes... maybe we should all switch to Erlang! What might be even more interesting is to make the newXXX methods return Query instead of a specific type. I'm not sure if that would work in all cases (surely not for BooleanQuery), but might for most of 'em. OK I was able to do this with many of them – attached new patch. Only PhraseQuery, MultiPhraseQuery, BooleanQuery/Clause required strong typing. All others now return Query.
          Hide
          john.wang@gmail.com John Wang added a comment -

          This looks great! Either subclassing or using a factory pattern works well in this case. Great job and thanks!

          Show
          john.wang@gmail.com John Wang added a comment - This looks great! Either subclassing or using a factory pattern works well in this case. Great job and thanks!
          Hide
          mikemccand Michael McCandless added a comment -

          Committed revision 690535. Thanks John!

          Show
          mikemccand Michael McCandless added a comment - Committed revision 690535. Thanks John!

            People

            • Assignee:
              mikemccand Michael McCandless
              Reporter:
              john.wang@gmail.com John Wang
            • Votes:
              1 Vote for this issue
              Watchers:
              0 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development