SOLR-356

pluggable functions (value sources)

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.3
    • Component/s: None
    • Labels:
      None

      Description

      Allow configuration of new value sources to be created by the function query parser.

      1. pluggableFunctions.patch
        40 kB
        Doug Daniels
      2. pluggableFunctions.patch
        25 kB
        Doug Daniels
      3. pluggableFunctions.patch
        20 kB
        Doug Daniels

        Activity

        Jon Pierce added a comment -

        A related thread on the list: http://www.mail-archive.com/solr-user@lucene.apache.org/msg06073.html http://www.nabble.com/pluggable-functions-tf4476995.html

        Doug Daniels added a comment -

        This is a quick attempt at pluggable functions using the same style as Yonik's QParser plugins. It takes the path suggested in http://www.nabble.com/forum/ViewPost.jtp?post=12770704&framed=y by letting people register new functions as ValueSourceParser plugins in solrconfig.

        Yonik Seeley added a comment -

        Thanks for taking a crack at this Doug!

        My initial thought was perhaps to simply have a Map from function name to ValueSource class, and the ValueSource could either
        1) take a List<Object> (or NamedList if we want to start supporting named params like python, etc)
        2) specify the argument list so that the function parser could validate the parameters (but on second thought, I think this could get too complex)

        But your use of ValueSourceParser looks to have advantages, as it's essentially a factory and can act as a virtual constructor, and can be initialized with different static params from config.

        One question would be if we really want to expose StrParser to the ValueSourceParser.
        StrParser is a really quick hack I threw together (that's grown) and I could see it changing in the future (especially if we eventually implement an infix parser). Two ways I see of isolating the ValueSourceParser from the low-level details of parsing are:
        1) have a ValueSourceParser.createValueSource(List params), and the function parser would create the list and pass it to the parser
        2) keep the current style, and lock down the public APIs on FunctionQParser. Remove some of the details of parsing (like reading separators). So the following code from your patch

            standardValueSourceParsers.put("max", new ValueSourceParser() {
              public ValueSource parse(FunctionQParser fp) throws ParseException {
                ValueSource source = fp.parseValSource();
                fp.getStrParser().expect(",");
                float val = fp.getStrParser().getFloat();
                return new MaxFloatFunction(source,val);
              }
            });
        

        Would look something more like

            standardValueSourceParsers.put("max", new ValueSourceParser() {
              public ValueSource parse(FunctionQParser fp) throws ParseException {
                ValueSource source = fp.getValSource();
                float val = fp.getFloat();
                return new MaxFloatFunction(source,val);
              }
            });
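
To make the difference concrete, here is a minimal, self-contained sketch of the second style. It assumes a hypothetical `ArgReader` class standing in for FunctionQParser (it is not the real Solr class, and it handles only plain float arguments); the point is only that the plugin-side code asks for typed arguments while the reader consumes the commas itself.

```java
public class SeparatorHidingSketch {

    /** Hypothetical argument reader; a stand-in for, not a copy of, FunctionQParser. */
    static class ArgReader {
        private final String input;
        private int pos = 0;
        private boolean first = true;

        ArgReader(String input) {
            this.input = input;
        }

        /** Returns the next float argument, consuming the separating comma itself. */
        float getFloat() {
            skipSpaces();
            if (!first) {
                expect(',');
                skipSpaces();
            }
            first = false;
            int start = pos;
            while (pos < input.length() && "+-.0123456789".indexOf(input.charAt(pos)) >= 0) {
                pos++;
            }
            if (start == pos) {
                throw new IllegalArgumentException("expected a number at position " + pos);
            }
            return Float.parseFloat(input.substring(start, pos));
        }

        private void expect(char c) {
            if (pos >= input.length() || input.charAt(pos) != c) {
                throw new IllegalArgumentException("expected '" + c + "' at position " + pos);
            }
            pos++;
        }

        private void skipSpaces() {
            while (pos < input.length() && input.charAt(pos) == ' ') {
                pos++;
            }
        }
    }

    /** Plugin-side code: reads two arguments without ever mentioning a separator. */
    static float max(ArgReader args) {
        float a = args.getFloat();
        float b = args.getFloat();
        return Math.max(a, b);
    }

    public static void main(String[] unused) {
        // The argument text is what would appear between the parentheses of max(...).
        System.out.println(max(new ArgReader("3.5, 2")));
    }
}
```

If the separator syntax ever changed, only `ArgReader` would need to change in this sketch; `max` would stay as written.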
        
        Doug Daniels added a comment -

        I agree that the ValueSourceParser should be isolated from the low-level details like grabbing commas between args.

        I thought about the first option you suggested, and it seems difficult with recursive functions. When the ValueSourceParser.createValueSource method expects another ValueSource as an arg, it would need to invoke whatever code created it (presumably from the FunctionQParser). Alternately, the FunctionQParser could ensure that the innermost functions are run first, passing their completed values out to enclosing functions as params.

        It seems simpler to me to go with the second option though – locking down the API on FunctionQParser. The ValueSourceParser would already have access to FunctionQParser, which it could call when it needs to parse a ValueSource argument.

        What do you think?
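
The recursion in the second option can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the patch: `Parser`, `ValueSource`, and `ValueSourceParser` below are simplified stand-ins defined in the example, value sources yield a single float rather than per-document values, and the `max`/`min` registrations are hypothetical. A plugin that needs a ValueSource argument simply calls back into the parser, which looks up nested function names in the same registry.

```java
import java.util.HashMap;
import java.util.Map;

public class NestedFunctionSketch {

    /** Stand-in for ValueSource: here it simply yields one float. */
    interface ValueSource {
        float value();
    }

    /** Stand-in for ValueSourceParser: a factory that pulls its arguments from the parser. */
    interface ValueSourceParser {
        ValueSource parse(Parser fp);
    }

    /** Hypothetical function parser; separators and parentheses stay private to it. */
    static class Parser {
        private final String s;
        private final Map<String, ValueSourceParser> registry;
        private int pos = 0;
        private boolean needComma = false;

        Parser(String s, Map<String, ValueSourceParser> registry) {
            this.s = s;
            this.registry = registry;
        }

        /** Parses one argument: a float constant or a nested call such as min(4, 7). */
        ValueSource parseValueSource() {
            skipSpaces();
            if (needComma) {
                expect(',');
                skipSpaces();
            }
            int start = pos;
            while (pos < s.length() && Character.isLetter(s.charAt(pos))) {
                pos++;
            }
            if (pos == start) {
                // No function name here, so read a float constant.
                float constant = readNumber();
                needComma = true;
                return () -> constant;
            }
            String name = s.substring(start, pos);
            ValueSourceParser plugin = registry.get(name);
            if (plugin == null) {
                throw new IllegalArgumentException("unknown function: " + name);
            }
            expect('(');
            needComma = false;                       // next read is the call's first argument
            ValueSource result = plugin.parse(this); // the plugin re-enters this parser
            expect(')');
            needComma = true;
            return result;
        }

        private float readNumber() {
            int start = pos;
            while (pos < s.length() && "+-.0123456789".indexOf(s.charAt(pos)) >= 0) {
                pos++;
            }
            if (start == pos) {
                throw new IllegalArgumentException("expected a number at position " + pos);
            }
            return Float.parseFloat(s.substring(start, pos));
        }

        private void expect(char c) {
            skipSpaces();
            if (pos >= s.length() || s.charAt(pos) != c) {
                throw new IllegalArgumentException("expected '" + c + "' at position " + pos);
            }
            pos++;
        }

        private void skipSpaces() {
            while (pos < s.length() && s.charAt(pos) == ' ') {
                pos++;
            }
        }
    }

    /** Builds a small registry (hypothetical entries) and evaluates one expression. */
    static float evaluate(String expression) {
        Map<String, ValueSourceParser> registry = new HashMap<>();
        registry.put("max", fp -> {
            ValueSource a = fp.parseValueSource();
            ValueSource b = fp.parseValueSource();
            return () -> Math.max(a.value(), b.value());
        });
        registry.put("min", fp -> {
            ValueSource a = fp.parseValueSource();
            ValueSource b = fp.parseValueSource();
            return () -> Math.min(a.value(), b.value());
        });
        return new Parser(expression, registry).parseValueSource().value();
    }

    public static void main(String[] unused) {
        System.out.println(evaluate("max(min(4, 7), 2)"));
    }
}
```

Because each plugin pulls its own arguments, the parser never needs to know a function's arity in advance, and inner calls are resolved before the enclosing plugin returns.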

        Doug Daniels added a comment -

        Here's a patch for the second option, hiding the low-level details from ValueSourceParser implementations.

        Hoss Man added a comment -

        I'm not really following this issue, but in skimming the comments I just wanted to toss out the idea that the approach taken by the Lucene-java xml-query-parser contrib might make sense here.

        It's got the same basic problem: support parsing and building of nested (query/function) structures where user configuration tells you which (query/function) name maps to which implementation.

        The code may not be reusable in this case, but the pattern may be (not that I remember much about what the pattern was, just that it made a lot of sense when it was being fleshed out).

        Doug Daniels added a comment -

        I found that thread at http://marc.info/?l=lucene-dev&m=113355526731460&w=2. It's quite a lengthy thread, but from what I read I agree that it's trying to solve a similar problem (plus a few additional problems that Solr has since solved nicely).

        Perhaps it's just personal preference, but I find XML descriptions of functions, though powerful and expressive, quite clunky. I far prefer the functional style for this sort of task.

        Maybe I'm missing something from the recommendation though. Were you recommending using XML to express the functions themselves, or something else about the xml-query-parser?

        Hoss Man added a comment -

        > Maybe I'm missing something from the recommendation though. Were you recommending using XML to express the functions themselves, or something else about the xml-query-parser?

        I was not suggesting an XML syntax ... just that the approach the xml-query-parser takes to deal with recursively parsing/eval-ing the XML structure using "user" configured implementations for each type of XML node seems to map closely to the idea of recursively parsing/eval-ing parenthetical function syntax using "user" configured implementations for each function name.

        In both cases the XML/paren parsing is trivial, it's deciding how to let the "user" tell you what Java objects to build based on each node/function name that gets interesting.

        (Like I said, I haven't looked at the xml-query-parser code since it was originally being written ... I may be over-romanticizing the idea behind its design and how applicable it could be in this case.)

        Yonik Seeley added a comment -

        Looking good! I think the current API is straightforward and relatively easy to support, even if we changed underlying parsing technologies.

        I think all we need now is a test that exercises plugging in a new function from solrconfig.xml...

        Doug Daniels added a comment -

        Added a sample ValueSourceParser plugin and some tests for it in TestFunctionQuery. The sample plugin is for an "nvl" function that replaces a null value in a doc with a float value passed as a parameter. It works much like the Oracle SQL function of the same name. It also takes an initialization parameter, to test that functionality.

        I also made TestFunctionQuery use a new copy of solrconfig.xml (in solrconfig-functionquery.xml) to avoid polluting the standard one with plugins.
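
As a rough illustration of the "nvl" idea, the sketch below uses stand-in types rather than the classes from the patch. It assumes (this is not stated above) that the initialization parameter designates which float value counts as "null"; the parameter name `nullSentinel`, the `NvlParser` class, and its `create` method are all hypothetical.

```java
import java.util.Collections;
import java.util.Map;

public class NvlSketch {

    /** Stand-in for a per-document value source. */
    interface ValueSource {
        float floatVal(int doc);
    }

    /** Hypothetical plugin: configured once via init(), then used to build value sources. */
    static class NvlParser {
        private float nullSentinel = 0.0f;

        /** Mimics reading a static initialization parameter from configuration. */
        void init(Map<String, Float> args) {
            Float configured = args.get("nullSentinel"); // hypothetical parameter name
            if (configured != null) {
                nullSentinel = configured;
            }
        }

        /** Wraps a source so that sentinel ("null") values are replaced. */
        ValueSource create(ValueSource source, float replacement) {
            final float sentinel = nullSentinel;
            return doc -> {
                float v = source.floatVal(doc);
                return v == sentinel ? replacement : v;
            };
        }
    }

    /** Applies nvl(field, 9.0) to three documents, two of which hold the sentinel. */
    static float[] demo() {
        float[] field = {0.0f, 2.5f, 0.0f};
        NvlParser parser = new NvlParser();
        parser.init(Collections.singletonMap("nullSentinel", 0.0f));
        ValueSource nvl = parser.create(doc -> field[doc], 9.0f);
        return new float[] {nvl.floatVal(0), nvl.floatVal(1), nvl.floatVal(2)};
    }

    public static void main(String[] unused) {
        for (float v : demo()) {
            System.out.println(v);
        }
    }
}
```

Keeping the sentinel in `init` and the replacement in the function call separates what is fixed per configuration from what varies per query.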

        Yonik Seeley added a comment -

        I just committed this. Thanks Doug!

        Hoss Man added a comment -

        This bug was modified as part of a bulk update using the criteria...

        • Marked "Resolved" and "Fixed"
        • Had no "Fix Version" versions
        • Was listed in the CHANGES.txt for 1.3 as of today 2008-03-15

        The Fix Version for all 29 issues found was set to 1.3; email notification was suppressed to prevent excessive email.

        For a list of all the issues modified, search jira comments for this (hopefully) unique string: batch20070315hossman1


          People

          • Assignee: Unassigned
          • Reporter: Yonik Seeley
          • Votes: 2
          • Watchers: 2
