  1. Lucene - Core
  2. LUCENE-1824

FastVectorHighlighter truncates words at beginning and end of fragments

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.5, 4.0-ALPHA
    • Component/s: modules/highlighter
    • Labels: None
    • Environment: any
    • Lucene Fields: New, Patch Available

    Description

      FastVectorHighlighter does not take word boundaries into account when building fragments, so in most cases the first and last words of a fragment are truncated. This makes the highlights less legible than they should be. I will attach a patch to BaseFragmentBuilder that resolves this by expanding the start and end boundaries of the fragment to the first whitespace character on either side of the fragment, or to the beginning or end of the source text, whichever comes first. This significantly improves legibility, at the cost of returning slightly more characters than the specified fragment size.
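
      The boundary expansion described above can be illustrated with a minimal, standalone sketch. It is not the attached patch and does not use any Lucene API: the class name FragmentBoundarySketch and its methods are hypothetical, and the sketch assumes that fragments are identified by character offsets into the source text and that "whitespace" means whatever Character.isWhitespace accepts.

```java
/**
 * Hypothetical illustration of whitespace-based fragment expansion.
 * Not part of Lucene and not the code from the attached patch.
 */
public final class FragmentBoundarySketch {

    private FragmentBoundarySketch() {}

    /** Walks left until the character before the start is whitespace, or the text begins. */
    public static int expandStart(CharSequence source, int start) {
        int s = Math.max(0, Math.min(start, source.length()));
        while (s > 0 && !Character.isWhitespace(source.charAt(s - 1))) {
            s--;
        }
        return s;
    }

    /** Walks right until the character at the end offset is whitespace, or the text ends. */
    public static int expandEnd(CharSequence source, int end) {
        int e = Math.max(0, Math.min(end, source.length()));
        while (e < source.length() && !Character.isWhitespace(source.charAt(e))) {
            e++;
        }
        return e;
    }

    /** Returns the fragment [start, end) widened to whole words; assumes start <= end. */
    public static String expand(String source, int start, int end) {
        return source.substring(expandStart(source, start), expandEnd(source, end));
    }

    public static void main(String[] args) {
        String text = "The quick brown fox jumps over the lazy dog";
        // Offsets 6..17 cut into the words "quick" and "fox".
        System.out.println(text.substring(6, 17)); // ick brown f
        System.out.println(expand(text, 6, 17));   // quick brown fox
    }
}
```

      As in the description, the expanded fragment can be longer than the requested size: here 11 requested characters grow to 15. A fragment whose start offset falls on whitespace would also pull in the preceding word, which a real implementation might want to treat differently.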

      Attachments

        1. LUCENE-1824.patch
          22 kB
          Koji Sekiguchi
        2. LUCENE-1824.patch
          21 kB
          Koji Sekiguchi
        3. LUCENE-1824.patch
          22 kB
          Koji Sekiguchi
        4. LUCENE-1824.patch
          17 kB
          Koji Sekiguchi
        5. LUCENE-1824.patch
          12 kB
          Koji Sekiguchi
        6. LUCENE-1824.patch
          9 kB
          Alex Vigdor

            People

              Assignee: Koji Sekiguchi (koji)
              Reporter: Alex Vigdor (alexvigdor)
              Votes: 6
              Watchers: 7