Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: search
    • Labels:
      None

      Description

      Solr currently has no support for Lucene's PayloadTermQuery, yet it has support for indexing payloads.

        Issue Links

          Activity

          Erik Hatcher added a comment -

          This class adds a QParserPlugin that creates PayloadTermQuery instances.

          This can be registered in solrconfig.xml like this:

          <queryParser name="payload" class="org.apache.solr.search.PayloadTermQueryPlugin"/>

          A custom Similarity is needed to score payloads (not provided with this issue).

          Once everything is lined up right (payload indexed, similarity with scorePayload implemented), a query like this can be used:
          http://localhost:8983/solr/select?q={!payload%20f=payloads%20func=avg}foo&debugQuery=true
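          A minimal sketch of how a client might assemble that request URL, assuming the parser is registered under the name "payload" as shown above. The class and method names (PayloadQueryUrl, localParamsQuery, selectUrl) are hypothetical and not part of the patch. Note that URLEncoder uses form encoding, so spaces become "+" rather than "%20"; both decode to a space in a query string.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the attached patch.
final class PayloadQueryUrl {

    // Builds the raw q value: local params in {!...} followed by the term.
    static String localParamsQuery(String parserName, String field, String func, String term) {
        return "{!" + parserName + " f=" + field + " func=" + func + "}" + term;
    }

    // Percent-encodes q and appends it to the select handler URL with debugQuery enabled.
    static String selectUrl(String baseUrl, String q) {
        String encoded = URLEncoder.encode(q, StandardCharsets.UTF_8);
        return baseUrl + "/select?q=" + encoded + "&debugQuery=true";
    }

    public static void main(String[] args) {
        String q = localParamsQuery("payload", "payloads", "avg", "foo");
        System.out.println(selectUrl("http://localhost:8983/solr", q));
    }
}
```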

          The debug output then shows an explanation like this:
          1.4450715 = (MATCH) fieldWeight(payloads:foo in 0), product of:
            4.709331 = (MATCH) btq, product of:
              0.70710677 = tf(phraseFreq=0.5)
              6.66 = scorePayload(...)
            0.30685282 = idf(payloads: foo=1)
            1.0 = fieldNorm(field=payloads, doc=0)
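          The numbers in the explanation above can be reproduced by hand. The sketch below assumes the classic tf and idf formulas of Lucene's DefaultSimilarity (tf = sqrt(freq), idf = 1 + ln(numDocs / (docFreq + 1))) and a one-document index (numDocs = 1), which is an inference from the idf value and not stated in the comment. The class name ExplainArithmetic is hypothetical.

```java
// Hypothetical worked example; reproduces the explain arithmetic with plain floats.
final class ExplainArithmetic {

    // Classic term-frequency factor: square root of the (phrase) frequency.
    static float tf(float freq) {
        return (float) Math.sqrt(freq);
    }

    // Classic inverse document frequency: 1 + ln(numDocs / (docFreq + 1)).
    static float idf(int docFreq, int numDocs) {
        return (float) (Math.log(numDocs / (double) (docFreq + 1)) + 1.0);
    }

    // fieldWeight = (tf * payloadScore) * idf * fieldNorm, as in the explanation tree.
    static float fieldWeight(float freq, float payloadScore, int docFreq, int numDocs, float fieldNorm) {
        float btq = tf(freq) * payloadScore;
        return btq * idf(docFreq, numDocs) * fieldNorm;
    }

    public static void main(String[] args) {
        // Assumed inputs: phraseFreq=0.5, scorePayload=6.66, docFreq=1, numDocs=1, fieldNorm=1.0
        System.out.println("tf          = " + tf(0.5f));
        System.out.println("idf         = " + idf(1, 1));
        System.out.println("fieldWeight = " + fieldWeight(0.5f, 6.66f, 1, 1, 1.0f));
    }
}
```

          With those inputs the three values should agree with the explanation (0.70710677, 0.30685282 and 1.4450715) up to float rounding.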

          Bill Au added a comment -

          Erik, have you started on this? I recently wrote a QParserPlugin that supports PayloadTermQuery. It is very bare-bones but could be a good starting point. I can attach my code here to get things started.

          Bill Au added a comment -

          Never mind. I just saw your update. Your code looks good.

          Bill Au added a comment -

          Erik, do you think we should support a default field and a default operator in the QParser used?

          Yonik Seeley added a comment -

          Moving out of 1.4 since this is a new feature that isn't ready to commit.
          As written, it looks more like "rawpayload" or something since no analysis is done on the input.

          Bill Au added a comment -

          I am +0 on including/excluding this from 1.4. FYI, Solr 1.4 already has a DelimitedPayloadTokenFilterFactory, which uses the DelimitedPayloadTokenFilter in Lucene. If we include this, I think we should also include a Similarity class for payloads, either as part of this JIRA or a separate one.

          There is also a similar JIRA on query support:

          https://issues.apache.org/jira/browse/SOLR-1337

          david added a comment -

          Hi,
          What if I want to do a boolean query?
          For example: payloadField:steve OR NonPayloadField:George

          Won't the payload plugin be used for all the query parts?

          Lance Norskog added a comment -

          Julien Noche posted last August that he had to create a new query parser variant of dismax. I cannot find an example of a query string in his post.

          Using Payloads with DisMaxQParser in SOLR

          Use cases for a payload-based query:

          • a raw byte stream
          • a serialized Java String
          • a number
          • a boolean value in the payload
          • "is there a payload?"
          • boosting a document if the search term has a payload
            • the payload is a number (packed float) created by

          Most of these can be encoded into a payload. But there are no matching decoders.
          There is no code that pulls the payload and uses the data.
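          To make the "packed float" case concrete, here is a minimal sketch of an encoder with a matching decoder. It assumes the payload is a 4-byte big-endian IEEE 754 float; whether a given index uses that layout depends on the encoder configured at index time. FloatPayloadCodec and its methods are hypothetical names, not an existing Lucene or Solr API.

```java
import java.nio.ByteBuffer;

// Hypothetical codec for a float stored as a 4-byte big-endian payload.
final class FloatPayloadCodec {

    // Packs the float into 4 bytes (ByteBuffer is big-endian by default).
    static byte[] encodeFloat(float value) {
        return ByteBuffer.allocate(4).putFloat(value).array();
    }

    // Reads a float back from 4 bytes starting at the given offset.
    static float decodeFloat(byte[] payload, int offset) {
        return ByteBuffer.wrap(payload, offset, 4).getFloat();
    }

    // Covers the "is there a payload?" use case.
    static boolean hasPayload(byte[] payload) {
        return payload != null && payload.length > 0;
    }

    public static void main(String[] args) {
        byte[] bytes = encodeFloat(6.66f);
        System.out.println(hasPayload(bytes) + " " + decodeFloat(bytes, 0));
    }
}
```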

          Erik Hatcher added a comment -

          Is there interest in rejuvenating this to get some form of SpanTermQuery support into Solr? I'll take a stab at updating this to work like the {!term} query parser, factoring in the field type and any needed analysis. Anything else?

          Perhaps for the dismax+payloads situation Lance mentioned, which will be a different issue altogether, we could make the SolrQueryParser implementation used by (e)dismax pluggable, so that there can be a span-aware one?

          Roland Deck added a comment -

          Hi
          I tried the PayloadTermQueryPlugin today.
          To get the scores as mentioned above I had to change the code a little.

          Here is the relevant code fragment:

          @Override
          public QParser createParser(String qstr, SolrParams localParams, SolrParams params, SolrQueryRequest req) {
            return new QParser(qstr, localParams, params, req) {
              @Override
              public Query parse() throws ParseException {
                // rdeck: hint: let's try setting includeSpanScore to true.
                // => Yes, it works (after having re-indexed all documents)!
                return new PayloadTermQuery(
                    new Term(localParams.get(QueryParsing.F), localParams.get(QueryParsing.V)),
                    createPayloadFunction(localParams.get("func")),
                    true); // was originally false instead of true
              }
            };
          }

          With includeSpanScore = false, I get score = payload value.
          With includeSpanScore = true, the payload takes part in the score calculation.
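          The difference between the two settings can be sketched without Lucene. The code below is a simplified model, assuming the flag (the third constructor argument, includeSpanScore in Lucene's PayloadTermQuery as far as I can tell) either multiplies the span score by the payload score or returns the payload score alone. PayloadScoreSketch, its Func enum and the neutral value 1.0 for "no payloads" are hypothetical choices for illustration.

```java
// Hypothetical model of how the flag changes the final score; not Lucene code.
final class PayloadScoreSketch {

    enum Func { AVG, MIN, MAX }

    // Combines the payload values seen for a term in one document.
    static float payloadScore(float[] payloads, Func func) {
        if (payloads.length == 0) {
            return 1.0f; // assumption: no payloads means a neutral factor
        }
        float sum = 0f;
        float min = Float.POSITIVE_INFINITY;
        float max = Float.NEGATIVE_INFINITY;
        for (float p : payloads) {
            sum += p;
            min = Math.min(min, p);
            max = Math.max(max, p);
        }
        switch (func) {
            case MIN: return min;
            case MAX: return max;
            default:  return sum / payloads.length;
        }
    }

    // true: payload scales the span score; false: the payload score is the whole score.
    static float score(float spanScore, float payloadScore, boolean includeSpanScore) {
        return includeSpanScore ? spanScore * payloadScore : payloadScore;
    }

    public static void main(String[] args) {
        float payload = payloadScore(new float[] {6.66f}, Func.AVG);
        System.out.println("flag=false: " + score(0.5f, payload, false));
        System.out.println("flag=true:  " + score(0.5f, payload, true));
    }
}
```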

          I have some questions left:

          1) Why is the PayloadTermQuery limited to just one field? Or will this change?
          2) How can I combine payload-dependent and payload-independent parts in a single query?

          Otis Gospodnetic added a comment -

          Erik Hatcher - not sure if you are watching SOLR-1337, so I'll write the same comment/Q here:

          My impression was that Span queries and Payloads are kind of passé in Luceneland... no?
          If yes, should we mark this as Won't Fix?

          Grant Ingersoll added a comment -

          I would say it would be good to support payloads, unless there is a better solution.


            People

            • Assignee: Unassigned
            • Reporter: Erik Hatcher
            • Votes: 10
            • Watchers: 17
