Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.3
    • Component/s: None
    • Labels:
      None

      Description

      One should be able to optionally specify an alternate query syntax on a per-query basis
      http://www.nabble.com/Using-HTTP-Post-for-Queries-tf3039973.html#a8483387
      Many benefits, including avoiding the need to do query parser escaping for simple term or prefix queries.
      Possible Examples:
      fq=<!term field="myfield">The Term
      fq=<!prefix field="myfield">The Prefix
      q=<!qp op="AND">a b
      q=<!xml><?xml...> // lucene XML query syntax?

      1. angle2curly.patch
        13 kB
        Yonik Seeley
      2. qparser.patch
        90 kB
        Yonik Seeley
      3. qparser.patch
        97 kB
        Yonik Seeley
      4. qparser.patch
        70 kB
        Yonik Seeley

          Activity

          Yonik Seeley added a comment -

          could be a way to mostly unify what are currently different request handlers and parameters

          <!dismax>the user query goes here
          <!function>recip(rord(datefield),1,2,3)

          Optionally, we could perhaps allow selective overriding of the dismax params:
          <!dismax mm=".5">another dismax query
          This would allow any query handler that needed to specify more than one query to be able to use more than one dismax query.

          Yonik Seeley added a comment -

          Another easy-yet-useful feature would be parameter substitution. Great for separating the user query from what you do with it.

          If the userq parameter contained the raw user query, one could specify a dismax query via
          q=<!dismax value=$userq>
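
A minimal sketch of the indirection described above, written in Python for illustration only. The function name `resolve_value` and the plain dictionaries standing in for request parameters are assumptions, not Solr's implementation; the idea is simply that a value starting with `$` is looked up among the other request parameters.

```python
def resolve_value(local_params, request_params):
    """Return the query text for a local-params block, following a $name reference."""
    value = local_params.get("value", "")
    if value.startswith("$"):
        # Indirection: fetch the real text from the other request parameters.
        return request_params.get(value[1:], "")
    return value


# Hypothetical request: the raw user query travels in its own parameter.
request_params = {
    "userq": "solr AND (lucene OR nutch)",
    "q": "<!dismax value=$userq>",
}
resolved = resolve_value({"value": "$userq"}, request_params)
print(resolved)
```

Keeping the user's text in a separate parameter means it never has to be spliced into the local-params block, so no quoting of that text is needed there.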

          Yonik Seeley added a comment -

          It occurs to me that this would also help gather other parameters together.

          Instead of
          facet.field=category&f.category.facet.offset=100&f.category.facet.limit=10&f.category.facet.sort=false&f.category.facet.mincount=1

          We could have
          facet.field=<!offset=100 limit=10 sort=false mincount=1>category
          OR
          facet.field=<!field=category offset=100 limit=10 sort=false mincount=1>
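
To make the equivalence concrete, here is a rough Python sketch that expands the proposed compact form back into the existing per-field parameters. The helper `expand_facet_field` is hypothetical and only illustrates the mapping suggested in this comment; it does not describe what Solr ended up implementing.

```python
from urllib.parse import urlencode


def expand_facet_field(field, local_params):
    """Turn compact per-field options into the long f.<field>.facet.<name> form."""
    params = [("facet.field", field)]
    for name, value in local_params.items():
        params.append((f"f.{field}.facet.{name}", value))
    return params


expanded = expand_facet_field(
    "category",
    {"offset": "100", "limit": "10", "sort": "false", "mincount": "1"},
)
query_string = urlencode(expanded)
print(query_string)
```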

          Yonik Seeley added a comment -

          OK, here's a prototype... it still needs some cleaning up and testing.

          Examples:
          <!lucene q.op=AND df=myfield>
          <!sort='price asc' start=100 rows=10>foo
          <!dismax>hi there
          <!dismax v=$userq> // indirection - userq is loaded from the other params
          <!prefix f=myfield>the unescaped prefix

          Query boosted by function (multiplied)
          <!boost b=sqrt(popularity)>foo:bar
          <!boost b=popularity defType=dismax>user query terms // set default query type for nested query

          Nested Queries in Lucene syntax:
          +foo +bar query:"<!dismax>a b"

          Nested Queries in FunctionQuery syntax (untested)
          sqrt(log(query($q,1.0)))
          sqrt(log(query(<!lucene v=$myq>,1.0)))

          You can't currently override any parameters for dismax, and I've only changed the standard request handler's main query to support this syntax. You should be able to register your own query plugins, but I haven't tested that yet.

          Yonik Seeley added a comment -

          refresh of patch, including external value source since it overlaps.

          • changed getValueSource so it accepted a qparser instance (SolrCore was needed for external value source patch)
          • updates to dismax parser so bq and bf parsing uses the qparser
          Yonik Seeley added a comment -

          re localParams: right now they are map<String,String> (I wanted to keep it lightweight)
          but I could perhaps see one wanting multi-valued params in the future.
          Should localParams be changed to map<String,String[]> or to SolrParams?
          Thoughts?

          Also, any syntax thoughts/improvements?
          I like <!a=1 b=2> fine in cleartext, but if you need to configure it in XML, then it's a bit uglier because of the needed escaping, but I'm not sure if it warrants a change to something else for that alone.

          Yonik Seeley added a comment -

          Attaching latest patch:

          • more tests + javadocs
          • "raw" and "field" query parsers
          • "dismax" parser can get its params from local params
          • localParams changed from map<String,String> to SolrParams for future flexibility

          I'll commit in the next few days barring objections.

          Yonik Seeley added a comment -

          I just committed this.

          TODO: document local params syntax and perhaps pull together single page doc on available query types

          Otis Gospodnetic added a comment -

          Yonik - it looks like this was committed but left open (for your last TODO?)

          Yonik Seeley added a comment -

          So I recently added a nested query parser... it's useful to be able to allow the client to specify query parts but not know about them.

          So a client could send bf=<!query v=$dateboost> to add a date boost, but the actual date boost query could be configured as a default on the server: dateboost=<!func>recip(rord(datefield),1,1000,1000)

          I'm finding the local params stuff very useful, but I hate the fact that when I type the following URL in firefox, it transforms all the special chars. It makes it very hard to edit (I use a browser a lot for testing). Also, < would need to be escaped in any XML config too.

          Example: I type in
          http://localhost:8983/solr/select?q=<!dismax qf='title^10 body'>foo
          But then firefox transforms it to
          http://localhost:8983/solr/select?q=%3C!dismax%20qf='title^10%20body'%3Efoo

          So while things are still changeable (before a release), is this really the right syntax?
          We could alternately go with [! which doesn't have this problem (and wouldn't have to be escaped in XML config either).
          So it could look like:
          http://localhost:8983/solr/select?q=[!dismax qf='title^10 body']foo
          Which firefox changes to
          http://localhost:8983/solr/select?q=[!dismax%20qf='title^10%20body']foo
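
The two escaping problems can be reproduced with a short Python sketch. The XML part uses the standard library's `escape`, which rewrites `<` and `>` but leaves brackets alone. The URL part is only an approximation: the `BROWSER_SAFE` set is an assumption chosen to mimic the browser behaviour reported above, and a strict percent-encoder would escape more characters.

```python
from urllib.parse import quote
from xml.sax.saxutils import escape

angle = "<!dismax qf='title^10 body'>foo"
bracket = "[!dismax qf='title^10 body']foo"

# XML config: '<' and '>' become entities, square brackets pass through.
xml_angle = escape(angle)
xml_bracket = escape(bracket)

# URL: characters listed here are left unencoded (assumed, to mimic the browser).
BROWSER_SAFE = "!'^=[]"
url_angle = quote(angle, safe=BROWSER_SAFE)
url_bracket = quote(bracket, safe=BROWSER_SAFE)

for label, text in [
    ("xml angle", xml_angle),
    ("xml bracket", xml_bracket),
    ("url angle", url_angle),
    ("url bracket", url_bracket),
]:
    print(f"{label:12} {text}")
```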

          Thoughts?

          Hoss Man added a comment -

          I'd vote for anything but "<" ... mainly because of the XML similarity and escaping needed..

          square brackets or curly braces are fine with me ... we could even go with a combo approach to reduce the likelihood of collision with any existing/future syntax people want to write QParser plugins for....

          bf={[!query v=$dateboost]}

          ...OR... and call me crazy here ... we could make the actual start/end tokens be configurable (could make sense as a <requestParser> option since using different markup per handler seems like overkill)

          Yonik Seeley added a comment -

          OK, I'm thinking of changing it to
          [!query v=$dateboost]

          At some point, more configurability has more drawbacks than benefits... if someone really needs a different escape for this stuff, then that can be done later. In the unlikely event of a future syntax collision, one can still easily escape the real query string by prepending a space.

          Hoss Man added a comment -

          well, at the very least i would suggest using {...} instead of [...] since square brackets already have meaning in the primary query parser syntax.

          from a huffman encoding standpoint, i would also argue that a multi character delimiter (ie: {[...]}) is better than a single character since it's the atypical behavior. things people type frequently should be easier than the things they type infrequently ... let's not make this too easy.

          Yonik Seeley added a comment - - edited

          It's not a single character at the start... it is currently <!
          I actually like the look of the curly braces (looks like a map), but using both types at once would make me always forget which came first (and it's uglier).

          So here are some more ideas:

          !{a=1}
          {{a=1}}
          [[a=1]]
          
          Yonik Seeley added a comment -

          Note that the middle option (double curly braces) didn't render correctly in JIRA, so I'd eliminate that one.

          Hoss Man added a comment -

          i'd put the "!" inside the delimiter ...

          {!query a=1}

          seems better to me than !{query a=1}
          Yonik Seeley added a comment -

          Attaching patch to change from <!foo> to {!foo}

          Seeing no other opinions, I'll commit after we get writable svn back.
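
For readers who want to see what the final `{!foo}` form implies for a parser, here is a simplified Python sketch. It is not the committed Solr code: `parse_local_params` is a hypothetical helper, it leans on `shlex` for quote handling, and it does not cope with a `}` inside a quoted value. It also shows the escape mentioned earlier in the thread, where a leading space makes the string an ordinary query.

```python
import shlex


def parse_local_params(query):
    """Split '{!type k=v ...}body' into (type, params, body)."""
    if not query.startswith("{!"):
        # No prefix (or a leading space used as an escape): plain query string.
        return None, {}, query
    end = query.find("}")
    if end == -1:
        raise ValueError("unterminated local params block")
    tokens = shlex.split(query[2:end])  # honours single-quoted values
    body = query[end + 1:]
    qtype, params = None, {}
    for index, token in enumerate(tokens):
        if "=" in token:
            key, _, value = token.partition("=")
            params[key] = value
        elif index == 0:
            # A bare first token names the query type, e.g. "dismax".
            qtype = token
    return qtype, params, body


print(parse_local_params("{!dismax qf='title^10 body'}foo"))
print(parse_local_params("{!query v=$dateboost}"))
print(parse_local_params(" {!dismax}treated as plain text"))
```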

          Yonik Seeley added a comment -

          curly brace patch committed.

          Shalin Shekhar Mangar added a comment -

          Seems like this was committed. Should we close this issue?

          David Smiley added a comment -

          I intend on submitting a patch very soon which I think is related to this. There are two parts to it,
          1. Enhancing the DisjunctionMaxQueryParser to work on all the query variants such as wildcard, prefix, and fuzzy queries. This was not in Solr already because DisMax was only used for a very limited syntax that didn't use those features. In my opinion, this makes a more suitable base parser for general use because, unlike the Lucene/Solr parser, this one supports multiple default fields, whereas other ones (say your <!prefix> one, for example) can't do dismax. The notion of a single default field is antiquated and a technical under-the-hood detail of Lucene that I think Solr should shield the user from by using a DisMax on the fly when multiple fields are used.

          2. Enhancing the DisMax QParser plugin to use a pluggable query string re-writer (via subclass extension) instead of the logic currently embedded within it (i.e. the escape-nearly-everything logic). Additionally, I made this QParser have a notion of a "simple" syntax (the default) or non-simple, in which case some of the logic in this QParser doesn't occur because it's irrelevant (phrase boosting and min-should-match in particular). As part of my work I significantly moved the code around to make it clearer and more extensible.

          Should I submit a new issue for this or add to this one?

          Yonik Seeley added a comment -

          Should I submit a new issue for this or add to this one?

          Should definitely get its own issue.

          Yonik Seeley added a comment -

          resolving this issue.


            People

            • Assignee:
              Unassigned
              Reporter:
              Yonik Seeley
            • Votes:
              2
              Watchers:
              1
