SOLR-6010

Wrong highlighting while querying by date range with wild card in the end range

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: 4.0
    • Fix Version/s: None
    • Component/s: highlighter, query parsers
    • Environment:

      Description

      Solr returns wrong highlights when I run a date range query with a wildcard as the end of the range. For example, my query q is:

      (porta)+activatedate:[* TO 2014-04-24T09:55:00Z]+expiredate:[2014-04-24T09:55:00Z TO *]
      

      In the above query, activatedate and expiredate are date fields. They are defined in the schema file as follows:

      <field name="activatedate" type="date" indexed="true" stored="false"
                 omitNorms="true"/>
      <field name="expiredate" type="date" indexed="true" stored="false"
                 omitNorms="true"/>
      

      The query result contains wrong highlighting information. Only the highlighting section of the response is shown below:

       "highlighting": {
          "article:3605": {
            "title": [
              "The <em>creative</em> <em>headline</em> of this <em>story</em> <em>really</em> <em>says</em> it <em>all</em>"
            ],
            "summary": [
              "<em>Etiam</em> <em>porta</em> <em>sem</em> <em>malesuada</em> <em>magna</em> <em>mollis</em> <em>euismod</em> <em>aenean</em> <em>eu</em> <em>leo</em> <em>quam</em>. <em>Pellentesque</em> <em>ornare</em> <em>sem</em> <em>lacinia</em> <em>quam</em>."
            ]
          },
          "article:3604": {
            "title": [
              "The <em>creative</em> <em>headline</em> of this <em>story</em> <em>really</em> <em>says</em> it <em>all</em>"
            ],
            "summary": [
              "<em>Etiam</em> <em>porta</em> <em>sem</em> <em>malesuada</em> <em>magna</em> <em>mollis</em> <em>euismod</em> <em>aenean</em> <em>eu</em> <em>leo</em> <em>quam</em>. <em>Pellentesque</em> <em>ornare</em> <em>sem</em> <em>lacinia</em> <em>quam</em>.."
            ]
          }
      }
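
To make the over-highlighting easier to inspect, a response like the one above can be post-processed to list every term wrapped in `<em>`. This is only a minimal sketch: the helper name `highlighted_terms` is hypothetical, the sample is a shortened copy of one entry from the response, and the expected term `story` is taken from the report.

```python
import re

# Matches the text inside each <em>...</em> pair produced by the highlighter.
EM_TERM = re.compile(r"<em>(.*?)</em>")


def highlighted_terms(highlighting):
    """Map each document id to the set of lowercased highlighted terms."""
    result = {}
    for doc_id, fields in highlighting.items():
        terms = set()
        for snippets in fields.values():
            for snippet in snippets:
                terms.update(t.lower() for t in EM_TERM.findall(snippet))
        result[doc_id] = terms
    return result


# Shortened copy of the "highlighting" section from the response.
sample = {
    "article:3605": {
        "title": [
            "The <em>creative</em> <em>headline</em> of this <em>story</em> "
            "<em>really</em> <em>says</em> it <em>all</em>"
        ]
    }
}

terms = highlighted_terms(sample)
# Terms that were highlighted although only "story" was expected.
unexpected = sorted(terms["article:3605"] - {"story"})
print(unexpected)
```

Every term in `unexpected` is one that the highlighter marked although the reporter did not expect it.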
      

      It should highlight only the word story, but it highlights many other words as well. I noticed that this happens only if I have a wildcard * as the end of the range. If I change the above query and set a fixed date as the end of the range instead of *, then Solr returns correct highlights. The modified query is shown below:

      (porta)+activatedate:[* TO 2014-04-24T09:55:00Z]+expiredate:[2014-04-24T09:55:00Z TO 3014-04-24T09:55:00Z]
      

      I guess it's a bug in Solr. If I use a filter query (fq) instead of the main query (q) for the date ranges, the highlighting result is correct for both queries.
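
The fq variant described above can be sketched as a request builder that keeps only the search term in `q` and moves both date ranges into `fq`. This is a sketch under assumptions, not a confirmed fix: the helper name `build_params` is hypothetical, and `hl.fl=title,summary` is inferred from the field names in the response rather than taken from the reporter's configuration.

```python
from urllib.parse import urlencode, parse_qs

NOW = "2014-04-24T09:55:00Z"


def build_params(term, now):
    """Keep the search term in q and move the date ranges into fq."""
    return [
        ("q", term),
        # Filter queries restrict the result set without being part of q.
        ("fq", f"activatedate:[* TO {now}]"),
        ("fq", f"expiredate:[{now} TO *]"),
        ("hl", "true"),
        # Assumption: these are the fields being highlighted.
        ("hl.fl", "title,summary"),
        ("wt", "json"),
    ]


# URL-encoded query string to append to the select handler URL.
query_string = urlencode(build_params("porta", NOW))
print(query_string)
```

Because the range clauses are no longer part of `q`, the highlighter has only the search term to work from, which matches the reporter's observation that fq gives correct highlights. Solr also has an `hl.q` parameter for supplying a separate query to highlight; whether it helps in this case is untested here.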

      Update
      Even if I use a specific date instead of *, it still returns wrong highlights. This time it highlights numbers as well. Say I am searching for the word math: it then also highlights numbers along with math. For example, if the title of my article is Mathematics 1234, then it highlights 1234 as well as math.

            People

            • Assignee:
              Unassigned
              Reporter:
              makhaer Mohammad Abul Khaer
            • Votes:
              0
              Watchers:
              4
