Solr / SOLR-6602

dismax query does not match with additional field in qf

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 4.10
    • Fix Version/s: None
    • Component/s: query parsers
    • Labels: None

    Description

      A query using the Solr dismax query parser no longer matches after I add another field to the qf parameter. I'd expect that an additional field in the qf parameter never leads to fewer matches.

      Test setup
      A document with rather strange content in a field "name_tokenized" of type "text_general":

      abc_<iframe src='loadLocale.js' onload='javascript:document.XSSed="name"' width=0 height=0>
      

      can be found when using the following dismax query with qf set to field "name_tokenized" only:

      http://localhost:44080/solr/studio/editor?deftype=dismax&q=abc_%3Ciframe+src%3D%27loadLocale.js%27+onload%3D%27javascript%3Adocument.XSSed%3D%22name%22%27&debug=true&echoParams=all&qf=name_tokenized^2
      

      When I submit exactly the same query but with an additional field "feederstate" of type "string" in the qf parameter, I get no results:

      http://localhost:44080/solr/studio/editor?deftype=dismax&q=abc_%3Ciframe+src%3D%27loadLocale.js%27+onload%3D%27javascript%3Adocument.XSSed%3D%22name%22%27&debug=true&echoParams=all&qf=name_tokenized^2%20feederstate
      

      The decoded value of q is:

      abc_<iframe src='loadLocale.js' onload='javascript:document.XSSed="name"'

      and it seems the trailing single quote causes the problem. (In fact, the document is found when I remove the last character.)
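To double-check the decoded value of q, the encoded string from the request URLs can be decoded with a few lines of Python. This is only a verification sketch; the variable names are illustrative and not part of the original report.

```python
from urllib.parse import unquote_plus

# Encoded q value copied from the request URLs in this report.
encoded_q = (
    "abc_%3Ciframe+src%3D%27loadLocale.js%27+onload%3D%27"
    "javascript%3Adocument.XSSed%3D%22name%22%27"
)

# unquote_plus converts '+' to a space and resolves the %XX escapes.
decoded_q = unquote_plus(encoded_q)

print(decoded_q)
print("ends with single quote:", decoded_q.endswith("'"))
```

The final %27 in the encoded string is the trailing single quote discussed above.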

      The parsed query for the latter case is

      (
        +((
          DisjunctionMaxQuery((feederstate:abc_<iframe | ((name_tokenized:abc_ name_tokenized:iframe)^2.0))~0.1)
          DisjunctionMaxQuery((feederstate:src='loadLocale.js' | ((name_tokenized:src name_tokenized:loadlocale.js)^2.0))~0.1)
          DisjunctionMaxQuery((feederstate:onload='javascript:document.XSSed= | ((name_tokenized:onload name_tokenized:javascript:document.xssed)^2.0))~0.1)
          DisjunctionMaxQuery((feederstate:name | name_tokenized:name^2.0)~0.1)
          DisjunctionMaxQuery((feederstate:')~0.1)
        )~5)
      
        DisjunctionMaxQuery((textbody:"abc_ iframe src loadlocale.js onload javascript:document.xssed name" | name_tokenized:"abc_ iframe src loadlocale.js onload javascript:document.xssed name"^2.0)~0.1)
      )/no_coord
      

      I've configured the search handler with <str name="mm">100%</str>, so all five dismax queries at the top must match. This one does not: DisjunctionMaxQuery((feederstate:')~0.1). Unlike the other four, it contains only a feederstate term and no name_tokenized term.

      (All mentioned field types are taken from the example schema.xml.)
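The following Python sketch is a simplified model of how the five clauses in the parsed query might arise. It is not Solr code. It assumes that the query is split into double-quoted phrases and whitespace-separated chunks, that a "string" field keeps each chunk as a single term, and that "text_general" behaves roughly like a lowercasing word tokenizer. The two regular expressions and all function names are assumptions made for illustration; they only approximate the real analysis chain.

```python
import re

QUERY = "abc_<iframe src='loadLocale.js' onload='javascript:document.XSSed=\"name\"'"

def split_clauses(q):
    # Assumed splitting rule: a double-quoted phrase, or a run without whitespace/quotes.
    return re.findall(r'"[^"]*"|[^\s"]+', q)

def analyze_string(chunk):
    # A "string" field is not tokenized: the chunk (minus phrase quotes) is one term.
    term = chunk.strip('"')
    return [term] if term else []

def analyze_text_general(chunk):
    # Crude approximation of a standard tokenizer plus lowercasing.
    return [t.lower() for t in re.findall(r"\w+(?:[.:]\w+)*", chunk)]

ANALYZERS = {
    "feederstate": analyze_string,
    "name_tokenized": analyze_text_general,
}

def build_clauses(q, qf):
    """Return one dict per clause, mapping each qf field to the terms it produced."""
    clauses = []
    for chunk in split_clauses(q):
        per_field = {}
        for field in qf:
            terms = ANALYZERS[field](chunk)
            if terms:
                per_field[field] = terms
        # A chunk that yields no terms in any field produces no clause at all.
        if per_field:
            clauses.append(per_field)
    return clauses

one_field = build_clauses(QUERY, ["name_tokenized"])
two_fields = build_clauses(QUERY, ["name_tokenized", "feederstate"])

print(len(one_field), len(two_fields))
print(two_fields[-1])
```

Under these assumptions, qf=name_tokenized alone yields four clauses because the lone single quote produces no token, while adding feederstate yields a fifth clause that holds only the term ' for feederstate. With mm set to 100%, that fifth clause must also match, which would explain why the document is no longer found.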

            People

              Assignee: Unassigned
              Reporter: Andreas Hubold (ahubold)
              Votes: 0
              Watchers: 3
