SOLR-4089: FastVectorHighlighter produces superfluous snippets for alternateField

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 4.0
    • Fix Version/s: 4.9, 6.0
    • Component/s: highlighter
    • Labels: None

    Description

      The highlighter produces multiple snippets for the alternateField, but only when the FastVectorHighlighter (FVH) is used. This becomes obvious only when the hl.fl parameter contains a glob. It is easy to reproduce by slightly modifying the example schema.

      Add the following fields to the schema. The more name_* fields you add, the more snippets are produced: one for each field matching the glob. The problem is visible only if hl.alternateField is set to an existing field.

      <field name="name_a" type="text_general" indexed="true" stored="true"/>
      <field name="name_b" type="text_general" indexed="true" stored="true"/>
      <field name="name_c" type="text_general" indexed="true" stored="true"/>
      <copyField source="name" dest="name_a"/>
      <copyField source="name" dest="name_b"/>
      <copyField source="name" dest="name_c"/>
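
To try the reproduction with a different number of glob-matching fields, the declarations above can be generated instead of typed by hand. This is only a minimal sketch: the helper name `schema_lines` is hypothetical, and it assumes the same field type and attributes as the three declarations shown above.

```python
def schema_lines(suffixes, source="name", field_type="text_general"):
    """Return <field> and <copyField> declarations for <source>_<suffix> fields.

    Hypothetical helper for this reproduction; it only formats strings and
    does not touch a Solr installation.
    """
    fields = [
        f'<field name="{source}_{s}" type="{field_type}" indexed="true" stored="true"/>'
        for s in suffixes
    ]
    copies = [f'<copyField source="{source}" dest="{source}_{s}"/>' for s in suffixes]
    return fields + copies


if __name__ == "__main__":
    # Prints the declarations for name_a, name_b and name_c.
    print("\n".join(schema_lines("abc")))
```

The printed lines still have to be pasted into the schema manually, and the example data reindexed afterwards.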
      

      Index the example data and run the query:

      http://localhost:8983/solr/select?q=id:6H500F0&hl=true&hl.fl=name*&hl.alternateField=id&hl.useFastVectorHighlighter=true
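
The same request can be assembled from its individual parameters, which makes it easier to toggle hl.alternateField or hl.useFastVectorHighlighter while testing. A minimal sketch, assuming the host, port and handler path from the URL above; the function name `highlight_query_url` is hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def highlight_query_url(base="http://localhost:8983/solr/select", doc_id="6H500F0"):
    """Build the reproduction query; reserved characters are percent-encoded."""
    params = [
        ("q", f"id:{doc_id}"),
        ("hl", "true"),
        ("hl.fl", "name*"),                       # glob that matches name, name_a, ...
        ("hl.alternateField", "id"),              # fallback field when nothing matches
        ("hl.useFastVectorHighlighter", "true"),  # the issue is reported for FVH only
    ]
    return base + "?" + urlencode(params)


if __name__ == "__main__":
    url = highlight_query_url()
    print(url)
    # Decode the query string again to confirm the parameters survive encoding.
    print(parse_qs(urlsplit(url).query))
```

The encoded URL differs textually from the hand-written one because characters such as `:` and `*` are escaped, but it decodes to the same parameters.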
      

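      This produces one alternate snippet for each glob-matched field that had no highlighting match, instead of emitting the id field only once as the alternate. The response below also contains a name_d field, which was evidently declared in the same way as the three fields above.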

      <response>
      
      <lst name="responseHeader">
        <int name="status">0</int>
        <int name="QTime">5</int>
        <lst name="params">
          <str name="hl.useFastVectorHighlighter">true</str>
          <str name="indent">true</str>
          <str name="q">id:6H500F0</str>
          <str name="hl.alternateField">id</str>
          <str name="hl.fl">name*</str>
          <str name="hl">true</str>
        </lst>
      </lst>
      <result name="response" numFound="1" start="0">
        <doc>
          <str name="id">6H500F0</str>
          <str name="name">Maxtor DiamondMax 11 - hard drive - 500 GB - SATA-300</str>
          <str name="name_a">Maxtor DiamondMax 11 - hard drive - 500 GB - SATA-300</str>
          <str name="name_b">Maxtor DiamondMax 11 - hard drive - 500 GB - SATA-300</str>
          <str name="name_c">Maxtor DiamondMax 11 - hard drive - 500 GB - SATA-300</str>
          <str name="name_d">Maxtor DiamondMax 11 - hard drive - 500 GB - SATA-300</str>
          <str name="manu">Maxtor Corp.</str>
          <str name="manu_id_s">maxtor</str>
          <arr name="cat">
            <str>electronics</str>
            <str>hard drive</str>
          </arr>
          <arr name="features">
            <str>SATA 3.0Gb/s, NCQ</str>
            <str>8.5ms seek</str>
            <str>16MB cache</str>
          </arr>
          <float name="price">350.0</float>
          <str name="price_c">350,USD</str>
          <int name="popularity">6</int>
          <bool name="inStock">true</bool>
          <str name="store">45.17614,-93.87341</str>
          <date name="manufacturedate_dt">2006-02-13T15:26:37Z</date>
          <long name="_version_">1418796316951052288</long></doc>
      </result>
      <lst name="highlighting">
        <lst name="6H500F0">
          <arr name="name">
            <str>6H500F0</str>
          </arr>
          <arr name="name_c">
            <str>6H500F0</str>
          </arr>
          <arr name="name_b">
            <str>6H500F0</str>
          </arr>
          <arr name="name_a">
            <str>6H500F0</str>
          </arr>
          <arr name="name_d">
            <str>6H500F0</str>
          </arr>
        </lst>
      </lst>
      </response>
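
The symptom in the response above can be detected mechanically: every highlighted field whose only snippet equals the stored value of the alternate field is a fallback entry rather than a real highlight. The following is a rough sketch under stated assumptions: the function name `alternate_only_fields` and the `SAMPLE` constant are hypothetical, `id` is assumed to be both the unique key and the alternate field, and `SAMPLE` is a trimmed version of the response shown above.

```python
import xml.etree.ElementTree as ET

# Trimmed, illustrative version of the response above (two highlighted fields only).
SAMPLE = """
<response>
  <result name="response" numFound="1" start="0">
    <doc>
      <str name="id">6H500F0</str>
      <str name="name">Maxtor DiamondMax 11 - hard drive - 500 GB - SATA-300</str>
    </doc>
  </result>
  <lst name="highlighting">
    <lst name="6H500F0">
      <arr name="name"><str>6H500F0</str></arr>
      <arr name="name_a"><str>6H500F0</str></arr>
    </lst>
  </lst>
</response>
""".strip()


def alternate_only_fields(response_xml, alternate_field="id", key_field="id"):
    """Map each document key to the highlighted fields that hold only the alternate value."""
    root = ET.fromstring(response_xml)

    # Look up the stored alternate-field value for every returned document.
    alt_values = {}
    for doc in root.findall("./result/doc"):
        stored = {el.get("name"): el.text for el in doc if el.text is not None}
        if key_field in stored:
            alt_values[stored[key_field]] = stored.get(alternate_field)

    report = {}
    for entry in root.findall("./lst[@name='highlighting']/lst"):
        alt = alt_values.get(entry.get("name"))
        report[entry.get("name")] = [
            arr.get("name")
            for arr in entry.findall("arr")
            if alt is not None and [s.text for s in arr.findall("str")] == [alt]
        ]
    return report


if __name__ == "__main__":
    print(alternate_only_fields(SAMPLE))
```

More than one field name in a document's list indicates the duplication described in this issue.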
      

      Attachments

        1. SOLR-4089-trunk.patch
          1.0 kB
          Markus Jelsma

          People

            Assignee: Unassigned
            Reporter: Markus Jelsma (markus17)
            Votes: 0
            Watchers: 1
