  1. Lucene - Core
  2. LUCENE-2910

Highlighter does not correctly highlight a phrase around the 50th term

    Details

    • Type: Bug
    • Status: Open
    • Priority: Trivial
    • Resolution: Unresolved
    • Affects Version/s: 2.9.4
    • Fix Version/s: None
    • Component/s: modules/highlighter
    • Labels:
      None
    • Lucene Fields:
      New, Patch Available

      Description

      When you use the Highlighter with an N-gram tokenizer such as CJKTokenizer and try to highlight a phrase that appears around the 50th term in the field, the highlighted phrase is shorter than expected.

      For example, highlighting "fooo" in the following text with a bigram tokenizer:
      "0---------1---------2---------3---------4---------fooo---"
      
      Expected: "0---------1---------2---------3---------4---------<B>fooo</B>---"
      Actual: "0---------1---------2---------3---------4---------f<B>ooo</B>---"
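
The expected result can be illustrated with a small, self-contained sketch. It does not call any Lucene API and makes no claim about the cause of the bug; the class `BigramHighlightSketch` and its methods are hypothetical names introduced only for illustration. It assumes a bigram tokenizer emits every overlapping two-character token with offsets [i, i+2), and that a matched phrase should be highlighted from the start offset of its first bigram to the end offset of its last.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: this is NOT Lucene's Highlighter and uses no Lucene classes.
class BigramHighlightSketch {

    // A bigram token with its character offsets [start, end) in the source text.
    static final class Token {
        final String term;
        final int start;
        final int end;

        Token(String term, int start, int end) {
            this.term = term;
            this.start = start;
            this.end = end;
        }
    }

    // Assumption: a bigram tokenizer emits every overlapping two-character token.
    static List<Token> bigrams(String text) {
        List<Token> tokens = new ArrayList<>();
        for (int i = 0; i + 2 <= text.length(); i++) {
            tokens.add(new Token(text.substring(i, i + 2), i, i + 2));
        }
        return tokens;
    }

    // Wrap the first run of consecutive bigrams that matches the phrase in <B>...</B>.
    static String highlight(String text, String phrase) {
        List<Token> textTokens = bigrams(text);
        List<Token> phraseTokens = bigrams(phrase);
        if (phraseTokens.isEmpty()) {
            return text;
        }
        for (int i = 0; i + phraseTokens.size() <= textTokens.size(); i++) {
            boolean match = true;
            for (int j = 0; j < phraseTokens.size(); j++) {
                if (!textTokens.get(i + j).term.equals(phraseTokens.get(j).term)) {
                    match = false;
                    break;
                }
            }
            if (match) {
                // The span runs from the first matching token's start to the last one's end.
                int start = textTokens.get(i).start;
                int end = textTokens.get(i + phraseTokens.size() - 1).end;
                return text.substring(0, start)
                        + "<B>" + text.substring(start, end) + "</B>"
                        + text.substring(end);
            }
        }
        return text;
    }

    public static void main(String[] args) {
        String text = "0---------1---------2---------3---------4---------fooo---";
        System.out.println(highlight(text, "fooo"));
    }
}
```

For the text in this report, the sketch wraps all four characters of "fooo", which is the Expected line above; the Actual line instead starts the highlight one character late.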
      


            People

            • Assignee:
              Unassigned
            • Reporter:
              Shinya Kasatani (shinya)
