Details

    • Type: Improvement Improvement
    • Status: Reopened
    • Priority: Minor Minor
    • Resolution: Unresolved
    • Affects Version/s: 1.3
    • Fix Version/s: None
    • Component/s: update
    • Labels:
      None

      Description

      From J.J. Larrea on SOLR-104

      2. What would make this even more powerful would be the ability to "subclass" (meaning refine and/or extend) request handler configs: If the requestHandler element allowed an attribute extends="<another-requesthandler-name>" and chained the SolrParams, then one could do something like:

      <requestHandler name="search/products/all" class="solr.DisMaxRequestHandler" >
      <lst name="defaults">
      <float name="tie">0.01</float>
      <str name="qf">
      text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4
      </str>
      ... much more, per the "dismax" example in the sample solrconfig.xml ...
      </requestHandler>

      ... and replacing the "partitioned" example ...
      <requestHandler name="search/products/instock" extends="search/products/all" >
      <lst name="appends">
      <str name="fq">inStock:true</str>
      </lst>
      </requestHandler>

      1. SOLR-112.patch
        10 kB
        Ryan McKinley

        Issue Links

          Activity

          Ryan McKinley created issue -
          Ryan McKinley made changes -
          Field Original Value New Value
          Attachment SOLR-112.patch [ 12349065 ]
          Hide
          Ryan McKinley added a comment -

          I attached a simple implementaion. This includes the NamedList changes from SOLR-107

          I could not figure out any good way to test this with unit tests. Any ideas?

          Show
          Ryan McKinley added a comment - I attached a simple implementaion. This includes the NamedList changes from SOLR-107 I could not figure out any good way to test this with unit tests. Any ideas?
          Hide
          Hoss Man added a comment -

          Damn Ryan ... you keep taking on cool features nad churning out patches too fast for me to read them!

          this sounds like a cool idea, the one big caveat is documenting exactly how the NamedList "merge" method you wrote is expected to work ... ie:

          • what it does if both named lists have the same key?
          • does it do deep merging of nested named list/collections?
          • what does it do if one list has an element without a name (first and formost a
            NamedLIst is an order list after all - the names are optional)

          ..as far as unit tests go, the easiest way to test something like this is to start by writing a unit test of just the NamedList mergin logic – independent of anything else (this class would be a good place to put a test of the SOLR-107 changes too by the way).

          next would be to test that the merge logic is getting used as you expect, with a test that uses a config file with several handlers all inheriting various properties from one another, and then a test that does queries against them – the easiest way to do validate that the init params were getting inherited correctly would probably be to use an "EchoConfigRequestHandler" that looked something like this...

          public class EchoConfigRequestHandler impliments SolrRequestHandler {
          private NamedList initParams;
          public void init(NamedList nl) { initParams = nl);
          public void handleRequest(SOlrQueryRequest req, SolrQueryResponse rsp)

          { rsp.add(initParams); }

          }

          the AbstractSolrTestClass makes it easy for you to use any solrconfig file you want by overriding a method – so you could even write one test case using one config file with lots of examples of inherited init params and a test method for each asserting that the params are what is expected, and then subclass it with another test class instance that's exactly the same except for the using a differnet solrconfig file where all of hte params are duplicated out explicitly – testing your assumptions so to speak.

          Show
          Hoss Man added a comment - Damn Ryan ... you keep taking on cool features nad churning out patches too fast for me to read them! this sounds like a cool idea, the one big caveat is documenting exactly how the NamedList "merge" method you wrote is expected to work ... ie: what it does if both named lists have the same key? does it do deep merging of nested named list/collections? what does it do if one list has an element without a name (first and formost a NamedLIst is an order list after all - the names are optional) ..as far as unit tests go, the easiest way to test something like this is to start by writing a unit test of just the NamedList mergin logic – independent of anything else (this class would be a good place to put a test of the SOLR-107 changes too by the way). next would be to test that the merge logic is getting used as you expect, with a test that uses a config file with several handlers all inheriting various properties from one another, and then a test that does queries against them – the easiest way to do validate that the init params were getting inherited correctly would probably be to use an "EchoConfigRequestHandler" that looked something like this... public class EchoConfigRequestHandler impliments SolrRequestHandler { private NamedList initParams; public void init(NamedList nl) { initParams = nl); public void handleRequest(SOlrQueryRequest req, SolrQueryResponse rsp) { rsp.add(initParams); } } the AbstractSolrTestClass makes it easy for you to use any solrconfig file you want by overriding a method – so you could even write one test case using one config file with lots of examples of inherited init params and a test method for each asserting that the params are what is expected, and then subclass it with another test class instance that's exactly the same except for the using a differnet solrconfig file where all of hte params are duplicated out explicitly – testing your assumptions so to speak.
          Hide
          Hoss Man added a comment -

          random idea i had that we might consider, not sure yet if i like it yet but i wanted to throw it out there...

          if someone has..

          <requestHandler name="foo" class="FooHandler"> ... <requestHandler>
          <requestHandler name="bar" extends="foo"> ... <requestHandler>
          <requestHandler name="foo/baz"> ... <requestHandler>

          (NOTE: foo/baz has no class or extends)

          could/should we assume that "foo/baz" extends "foo" since it's a prefix of the name?

          Show
          Hoss Man added a comment - random idea i had that we might consider, not sure yet if i like it yet but i wanted to throw it out there... if someone has.. <requestHandler name="foo" class="FooHandler"> ... <requestHandler> <requestHandler name="bar" extends="foo"> ... <requestHandler> <requestHandler name="foo/baz"> ... <requestHandler> (NOTE: foo/baz has no class or extends) could/should we assume that "foo/baz" extends "foo" since it's a prefix of the name?
          Hide
          Ryan McKinley added a comment -

          I think that path should be specified explicitly.

          I like that
          <requestHandler name="foo" class="FooHandler"> ... <requestHandler>

          will only match /select?wt=foo

          and that:
          <requestHandler name="/foo" class="FooHandler"> ... <requestHandler>

          will match
          /foo (and /select?wt=/foo)

          I like the idea that somone adding the prefix '/' is an explicit gesture they want to set the URL path. (even if it overrides something else, for example /admin)

          Show
          Ryan McKinley added a comment - I think that path should be specified explicitly. I like that <requestHandler name="foo" class="FooHandler"> ... <requestHandler> will only match /select?wt=foo and that: <requestHandler name="/foo" class="FooHandler"> ... <requestHandler> will match /foo (and /select?wt=/foo) I like the idea that somone adding the prefix '/' is an explicit gesture they want to set the URL path. (even if it overrides something else, for example /admin)
          Hide
          J.J. Larrea added a comment -

          I'm sure you won't like your extemperaneous suggestion (foo/baz implicitly extending foo with baz) once you get a chance to think about it, Hoss.

          The concern of efficiently structuring request handlers in solrconfig is quite different from he concern of publishing them to the outside world. For example, mightn't one set up the equivalent of an "abstract base class" request config which has no value being invoked directly in a request, but has great value as the root of a tree of request configs which will be invoked? And similarly, shouldn't one be able to rearrange the internal configuration (e.g. refactoring) without affecting an already "published" request syntax?

          If it didn't break backwards compatibility, one could even consider having separate arguments defining an internal name (used for extending) and an external name (used for invoking), with either one being optional – allowing configs which are uninvokable but extendable, and vice versa. Or for better backwards compatibility, one name could default to the other, but could be explicitly overridden (potentially to the empty string) if so desired. I am not advocating either of these approaches (simpler is perhaps better) as much as using them to illustrate the separability of the concerns.

          Does this make sense?

          Show
          J.J. Larrea added a comment - I'm sure you won't like your extemperaneous suggestion (foo/baz implicitly extending foo with baz) once you get a chance to think about it, Hoss. The concern of efficiently structuring request handlers in solrconfig is quite different from he concern of publishing them to the outside world. For example, mightn't one set up the equivalent of an "abstract base class" request config which has no value being invoked directly in a request, but has great value as the root of a tree of request configs which will be invoked? And similarly, shouldn't one be able to rearrange the internal configuration (e.g. refactoring) without affecting an already "published" request syntax? If it didn't break backwards compatibility, one could even consider having separate arguments defining an internal name (used for extending) and an external name (used for invoking), with either one being optional – allowing configs which are uninvokable but extendable, and vice versa. Or for better backwards compatibility, one name could default to the other, but could be explicitly overridden (potentially to the empty string) if so desired. I am not advocating either of these approaches (simpler is perhaps better) as much as using them to illustrate the separability of the concerns. Does this make sense?
          Hide
          J.J. Larrea added a comment -

          Re foo vs. /foo:

          I think of the SolrServlet as being just one way to invoke the request dispatcher. One could for example write a SOAP or other RPC message receiver which called a method something like handleRequest(String reqName, SolrQueryRequest req, SolrQueryResponse rsp)1. So I wouldn't want to bind the request invocation syntax too tightly to a URL-based mechanism for invocation.

          Similarly, I think of allowing slashes in request handler names as merely a convention; it could be "search.products.instock" or "search-products-instock" just as easily. Of course, it is advantageous for the handler name to be RFC1738-compliant (as those examples both are) so the pathInfo can be used to set the name, as we all like, e.g. http://localhost:8989/solr/select/search-products-instock

          What your suggestion comes down to, Ryan, is whether the pathInfo-parsed request adds a leading / slash to the request name, or not. If it does it forces URL syntax into the request-naming space, and while that won't particularly hurt anything I'm not sure it buys anything either... Why should a SOLR configurer need to make an explicit gesture to indicate they want to use the more "modern" pathInfo-based invocation style rather than the older qt= invocation style? And shouldn't the request handler definition be either agnostic as to the request method (GET, POST, pathINFO, qt=, SOAP, direct API call, ...) or else have access to a more comprehensive mechanism for filtering which methods they respond to?

          1 (I haven't yet had a chance to catch up on the voluminous SOLR-104 discussion so this may not be the currently contemplated syntax, but hopefully my argument for potentially supporting non-URL-based invokers still holds.)

          Show
          J.J. Larrea added a comment - Re foo vs. /foo: I think of the SolrServlet as being just one way to invoke the request dispatcher. One could for example write a SOAP or other RPC message receiver which called a method something like handleRequest(String reqName, SolrQueryRequest req, SolrQueryResponse rsp) 1 . So I wouldn't want to bind the request invocation syntax too tightly to a URL-based mechanism for invocation. Similarly, I think of allowing slashes in request handler names as merely a convention; it could be "search.products.instock" or "search-products-instock" just as easily. Of course, it is advantageous for the handler name to be RFC1738-compliant (as those examples both are) so the pathInfo can be used to set the name, as we all like, e.g. http://localhost:8989/solr/select/search-products-instock What your suggestion comes down to, Ryan, is whether the pathInfo-parsed request adds a leading / slash to the request name, or not. If it does it forces URL syntax into the request-naming space, and while that won't particularly hurt anything I'm not sure it buys anything either... Why should a SOLR configurer need to make an explicit gesture to indicate they want to use the more "modern" pathInfo-based invocation style rather than the older qt= invocation style? And shouldn't the request handler definition be either agnostic as to the request method (GET, POST, pathINFO, qt=, SOAP, direct API call, ...) or else have access to a more comprehensive mechanism for filtering which methods they respond to? 1 (I haven't yet had a chance to catch up on the voluminous SOLR-104 discussion so this may not be the currently contemplated syntax, but hopefully my argument for potentially supporting non-URL-based invokers still holds.)
          Hide
          J.J. Larrea added a comment -

          Regarding SOLR-112.patch, +1 on on the NamedList changes included from SOLR-107.

          But I'm not sure a "blind" NamedList merging is going to accomplish what is needed for an extends mechanism. As it stands that merges defaults into defaults, invariants into invariants, appends into appends, along with any other named lists that happen to be in the base list. That could stand some further thought...

          First, each existing SolrRequestHandler implementation (Standard and DisMax) in init() now parses defaults, appends, and invariants sub-NamedLists into individual SolrParams, and in handleRequest passes those and the request SolrParams into SolrPluginUtils.setDefaults, which sets up a dynamic chain:

          params => invariant-parms -> ((request-params -> default-params) + appended-params)

          such that invariant-params override request-params which override default-params, with appended-params appended before the invariants override. It's pretty cool.

          The semantics of extended request handler configs may perhaps need to be similar, with the extender's invariant params overriding any params in the extendee, extendee invariants perhaps overriding non-invariants in the extender , extender defaults overriding extendee defaults, and the appends concatenated. I suppose given the above rule, it would simply be:

          invariant-params => extender-invariants -> extendee-invariants
          default-params => extender-defaults -> extendee defaults
          appended-params => extender-appends + extendee-appends

          (answering Hoss' question about what should happen if the keys are identical, but not his other questions)

          In some ways this would be neatest if implemented at the SolrParams level, since there are nice classes like DefaultSolrParams and AppendedSolrParams, implementing the -> and + operators I used above, respectively, and which have informative toString()s. Unfortunately that's not so simple without serious refactoring, since SolrRequestHandler.init takes a NamedList; as defined, a handler could look for handler-specific initialization data in the NamedList, and also doesn't necessarily need to parse and chain invariants, defaults, and appends.

          I suppose this would still work at the NamedList level, as long as there were a flag for how to treat key conflicts (override vs. append), and that key were set specifically for invariants and defaults, or just appends.

          If done that way, it still leaves the question of whether the "merge" is done statically by changing the NamedList, or dynamically by allowing NamedLists to be chained. I suppose the only advantage of chaining is that toString() would be more informative, and alone that's probably not sufficient to justify the additional complexity.

          Anyway, those are some ideas for discussion.

          Show
          J.J. Larrea added a comment - Regarding SOLR-112 .patch, +1 on on the NamedList changes included from SOLR-107 . But I'm not sure a "blind" NamedList merging is going to accomplish what is needed for an extends mechanism. As it stands that merges defaults into defaults, invariants into invariants, appends into appends, along with any other named lists that happen to be in the base list. That could stand some further thought... First, each existing SolrRequestHandler implementation (Standard and DisMax) in init() now parses defaults, appends, and invariants sub-NamedLists into individual SolrParams, and in handleRequest passes those and the request SolrParams into SolrPluginUtils.setDefaults, which sets up a dynamic chain: params => invariant-parms -> ((request-params -> default-params) + appended-params) such that invariant-params override request-params which override default-params, with appended-params appended before the invariants override. It's pretty cool. The semantics of extended request handler configs may perhaps need to be similar, with the extender's invariant params overriding any params in the extendee, extendee invariants perhaps overriding non-invariants in the extender , extender defaults overriding extendee defaults, and the appends concatenated. I suppose given the above rule, it would simply be: invariant-params => extender-invariants -> extendee-invariants default-params => extender-defaults -> extendee defaults appended-params => extender-appends + extendee-appends (answering Hoss' question about what should happen if the keys are identical, but not his other questions) In some ways this would be neatest if implemented at the SolrParams level, since there are nice classes like DefaultSolrParams and AppendedSolrParams, implementing the -> and + operators I used above, respectively, and which have informative toString()s. Unfortunately that's not so simple without serious refactoring, since SolrRequestHandler.init takes a NamedList; as defined, a handler could look for handler-specific initialization data in the NamedList, and also doesn't necessarily need to parse and chain invariants, defaults, and appends. I suppose this would still work at the NamedList level, as long as there were a flag for how to treat key conflicts (override vs. append), and that key were set specifically for invariants and defaults, or just appends. If done that way, it still leaves the question of whether the "merge" is done statically by changing the NamedList, or dynamically by allowing NamedLists to be chained. I suppose the only advantage of chaining is that toString() would be more informative, and alone that's probably not sufficient to justify the additional complexity. Anyway, those are some ideas for discussion.
          Hide
          Hoss Man added a comment -

          I think you're dead on JJ ... any generic NamedList merging won't neccessarily do "the right thing" in all cases when talking about RequestHandler init args – very special logic would need to be used to deal with the defaults/appends/invarients in a logical manner, and that logic may not be bale to take into account other init params that other RequestHandlers (subclasses of the core ones perhaps) might add.

          a cleaner way to deal with this might just be to have the individual RequestHandlers manage this themselves – using SolrCore.getRequestHandler(String) and protected methods they explicitly support to allow other instances to get access to their SolrParams.

          ie...

          <requestHandler name="search/products/all" class="solr.DisMaxRequestHandler" >
          <lst name="defaults">
          <str name="facet.field">category</str>
          <float name="tie">0.01</float>
          <str name="qf">
          text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4
          </str>
          </requestHandler>

          <requestHandler name="search/products/instock" class="solr.DisMaxRequestHandler">
          <str name="extends">search/products/all</str>
          <lst name="defaults">
          <str name="facet.query">price:[0 TO 100]</str>
          <str name="facet.query">price:[100 TO *]</str>
          </lst>
          <lst name="appends">
          <str name="fq">inStock:true</str>
          </lst>
          </requestHandler>

          ...where DisMaxRequestHandler (or most likely teh new Base class Ryan has written) has methods like...

          protected SolrParams getDefaults()
          protected SolrParams getInvarients()
          protected SolrParams getAppends()

          ...and the init method looks for an "extends" arg, if it's there fetches it from the SolrCore, tests it's class and casts it, then calls the methods above and builds up it's own SolrParams usign a combination of those and the ones explicitly specified in it's config.

          Show
          Hoss Man added a comment - I think you're dead on JJ ... any generic NamedList merging won't neccessarily do "the right thing" in all cases when talking about RequestHandler init args – very special logic would need to be used to deal with the defaults/appends/invarients in a logical manner, and that logic may not be bale to take into account other init params that other RequestHandlers (subclasses of the core ones perhaps) might add. a cleaner way to deal with this might just be to have the individual RequestHandlers manage this themselves – using SolrCore.getRequestHandler(String) and protected methods they explicitly support to allow other instances to get access to their SolrParams. ie... <requestHandler name="search/products/all" class="solr.DisMaxRequestHandler" > <lst name="defaults"> <str name="facet.field">category</str> <float name="tie">0.01</float> <str name="qf"> text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4 </str> </requestHandler> <requestHandler name="search/products/instock" class="solr.DisMaxRequestHandler"> <str name="extends">search/products/all</str> <lst name="defaults"> <str name="facet.query">price: [0 TO 100] </str> <str name="facet.query">price: [100 TO *] </str> </lst> <lst name="appends"> <str name="fq">inStock:true</str> </lst> </requestHandler> ...where DisMaxRequestHandler (or most likely teh new Base class Ryan has written) has methods like... protected SolrParams getDefaults() protected SolrParams getInvarients() protected SolrParams getAppends() ...and the init method looks for an "extends" arg, if it's there fetches it from the SolrCore, tests it's class and casts it, then calls the methods above and builds up it's own SolrParams usign a combination of those and the ones explicitly specified in it's config.
          Ryan McKinley made changes -
          Affects Version/s 1.3 [ 12312486 ]
          Affects Version/s 1.2 [ 12312235 ]
          Ryan McKinley made changes -
          Fix Version/s 1.3 [ 12312486 ]
          Fix Version/s 1.2 [ 12312235 ]
          Mike Klaas made changes -
          Fix Version/s 1.3 [ 12312486 ]
          Hoss Man made changes -
          Link This issue is duplicated by SOLR-3293 [ SOLR-3293 ]
          Hide
          Erick Erickson added a comment -

          2013 Old JIRA cleanup

          Show
          Erick Erickson added a comment - 2013 Old JIRA cleanup
          Erick Erickson made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Resolution Won't Fix [ 2 ]
          Erick Erickson made changes -
          Resolution Won't Fix [ 2 ]
          Status Resolved [ 5 ] Reopened [ 4 ]

            People

            • Assignee:
              Unassigned
              Reporter:
              Ryan McKinley
            • Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated:

                Development