  1. Solr
  2. SOLR-3110

Search result comes up with truncated words at the start of highlighted fragment


Details

    Description

      Words are being truncated at the start of the displayed highlighter fragment.
      The following boundary scanner setting is configured in the solrconfig.xml file:

      <str name="hl.bs.chars">.,!? &#9;&#10;&#13;</str>

      If I change the setting to

      <str name="hl.bs.chars">.,!?</str>

      then the truncation goes away, but another issue appears: the highlighted search fragment no longer starts at the beginning of the sentence.

      Below is the complete list of settings we are using for the boundary scanners.

      <boundaryScanner name="simple" class="solr.highlight.SimpleBoundaryScanner" default="true">
        <lst name="defaults">
          <str name="hl.bs.maxScan">200</str>
          <str name="hl.bs.chars">.,!? &#9;&#10;&#13;</str>
        </lst>
      </boundaryScanner>

      <boundaryScanner name="breakIterator" class="solr.highlight.BreakIteratorBoundaryScanner">
        <lst name="defaults">
          <str name="hl.bs.type">SENTENCE</str>
          <str name="hl.bs.language">en</str>
          <str name="hl.bs.country">US</str>
        </lst>
      </boundaryScanner>
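
To make the difference between the two `hl.bs.chars` values concrete, here is a minimal Java sketch of a backward boundary search. It is a simplified illustration only, not Solr's or Lucene's actual code: the class `BoundaryScanSketch`, the method `findStart`, and the sample text are all hypothetical. The character entities `&#9;`, `&#10;` and `&#13;` decode to tab, line feed and carriage return, so the first setting treats whitespace as a boundary and the second does not.

```java
public class BoundaryScanSketch {

    // Hypothetical helper: walk left from `start` for at most `maxScan`
    // characters and return the offset just after the first boundary
    // character found.
    public static int findStart(String text, int start, String boundaryChars, int maxScan) {
        if (start < 1 || start > text.length()) {
            return start; // nothing to scan
        }
        int offset = start;
        for (int count = maxScan; offset > 0 && count > 0; count--, offset--) {
            if (boundaryChars.indexOf(text.charAt(offset - 1)) >= 0) {
                return offset;
            }
        }
        // Reaching the start of the text counts as a boundary;
        // otherwise the scan gave up and the raw offset is kept.
        return offset == 0 ? 0 : start;
    }

    public static void main(String[] args) {
        // Assumed sample text, not taken from the issue.
        String text = "Solr is fast. It highlights matching terms, and more.";
        int rawStart = text.indexOf("ghlights"); // a raw offset that falls mid-word

        String withWhitespace = ".,!? \t\n\r"; // decoded ".,!? &#9;&#10;&#13;"
        String punctuationOnly = ".,!?";

        int a = findStart(text, rawStart, withWhitespace, 200);
        int b = findStart(text, rawStart, punctuationOnly, 200);

        System.out.println("[" + text.substring(a) + "]");
        System.out.println("[" + text.substring(b) + "]");
    }
}
```

Under this naive model the whitespace-inclusive set snaps a mid-word offset back to the start of the word, while the punctuation-only set snaps it back to just after the previous `.`, `,`, `!` or `?`. The truncated words reported above are therefore not what this simple model predicts for the first setting, which is why the behavior is being reported as a bug.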

          People

            Assignee: Unassigned
            Reporter: Shyam Bhaskaran (shyamb)
            Votes: 1
            Watchers: 5
