  1. Lucene - Core
  2. LUCENE-1824

FastVectorHighlighter truncates words at beginning and end of fragments

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.5, 4.0-ALPHA
    • Component/s: modules/highlighter
    • Labels: None
    • Environment: any
    • Lucene Fields: New, Patch Available

      Description

      FastVectorHighlighter does not take word boundaries into account when building fragments, so in most cases the first and last words of a fragment are truncated. This makes the highlights less legible than they should be. I will attach a patch to BaseFragmentBuilder that resolves this by expanding the start and end boundaries of the fragment outward to the nearest whitespace character on either side, or to the beginning or end of the source text, whichever comes first. This significantly improves legibility, at the cost of returning slightly more characters than the specified fragment size.
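
      The boundary expansion described above can be sketched roughly as follows. This is a minimal illustration, not the attached patch and not the BaseFragmentBuilder API: the class FragmentBoundaryExpander and its methods expandStart, expandEnd, and expand are hypothetical names, and the sketch assumes "whitespace" means whatever Character.isWhitespace accepts.

```java
// Hypothetical helper for illustration only; not part of Lucene.
final class FragmentBoundaryExpander {

    private FragmentBoundaryExpander() {}

    // Move the start offset left until the preceding character is
    // whitespace or the beginning of the text is reached.
    static int expandStart(CharSequence text, int start) {
        while (start > 0 && !Character.isWhitespace(text.charAt(start - 1))) {
            start--;
        }
        return start;
    }

    // Move the (exclusive) end offset right until it sits on a
    // whitespace character or the end of the text is reached.
    static int expandEnd(CharSequence text, int end) {
        while (end < text.length() && !Character.isWhitespace(text.charAt(end))) {
            end++;
        }
        return end;
    }

    // Return the fragment [start, end) widened to whole words.
    static String expand(CharSequence text, int start, int end) {
        int s = expandStart(text, start);
        int e = expandEnd(text, end);
        return text.subSequence(s, e).toString();
    }
}
```

      For the text "The quick brown fox jumps over the lazy dog", the raw offsets 6 to 22 cover "ick brown fox ju", and expand(text, 6, 22) widens that to "quick brown fox jumps". The result is five characters longer than the requested span, which is the size trade-off mentioned in the description.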

        Attachments

        1. LUCENE-1824.patch
          22 kB
          Koji Sekiguchi
        2. LUCENE-1824.patch
          21 kB
          Koji Sekiguchi
        3. LUCENE-1824.patch
          22 kB
          Koji Sekiguchi
        4. LUCENE-1824.patch
          17 kB
          Koji Sekiguchi
        5. LUCENE-1824.patch
          12 kB
          Koji Sekiguchi
        6. LUCENE-1824.patch
          9 kB
          Alex Vigdor

              People

              • Assignee: Koji Sekiguchi (koji)
              • Reporter: Alex Vigdor (alexvigdor)
              • Votes: 6
              • Watchers: 7
