Lucene - Core / LUCENE-2910

Highlighter does not correctly highlight the phrase around 50th term

    Details

    • Type: Bug
    • Status: Open
    • Priority: Trivial
    • Resolution: Unresolved
    • Affects Version/s: 2.9.4
    • Fix Version/s: None
    • Component/s: modules/highlighter
    • Labels: None
    • Lucene Fields: New, Patch Available

      Description

      When the Highlighter is combined with an N-gram tokenizer such as CJKTokenizer and asked to highlight a phrase that appears around the 50th term in the field, the highlighted span is shorter than expected.

      For example, highlighting "fooo" in the following text with a bigram tokenizer:
      "0---------1---------2---------3---------4---------fooo---"
      
      Expected: "0---------1---------2---------3---------4---------<B>fooo</B>---"
      Actual: "0---------1---------2---------3---------4---------f<B>ooo</B>---"
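
      The offsets in this example can be worked through with a small, self-contained sketch. It does not use Lucene: `BigramOffsetSketch`, `bigrams`, and `highlight` are hypothetical names, and the `skipFirstMatches` parameter is only a modelling device, not a claim about how the Highlighter works internally. Under the assumption that a bigram tokenizer emits one token per character offset, the first bigram inside "fooo" starts at offset 50, which makes it the 51st token. Leaving that single token out of the highlighted span reproduces the reported output.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a hand-rolled bigram tokenizer, not Lucene's CJKTokenizer or Highlighter.
final class BigramOffsetSketch {

    /** A two-character token with its character offsets; end is exclusive. */
    static final class Bigram {
        final String text;
        final int start;
        final int end;

        Bigram(String text, int start) {
            this.text = text;
            this.start = start;
            this.end = start + 2;
        }
    }

    /** Emits one overlapping bigram per character offset. */
    static List<Bigram> bigrams(String s) {
        List<Bigram> out = new ArrayList<>();
        for (int i = 0; i + 2 <= s.length(); i++) {
            out.add(new Bigram(s.substring(i, i + 2), i));
        }
        return out;
    }

    /**
     * Wraps in <B>...</B> the union of the offsets of all bigrams that lie inside the
     * first occurrence of phrase, ignoring the first skipFirstMatches of them.
     * skipFirstMatches = 0 models the expected behaviour; 1 models losing one token.
     */
    static String highlight(String text, String phrase, int skipFirstMatches) {
        int at = text.indexOf(phrase);
        if (at < 0) {
            return text;
        }
        int start = Integer.MAX_VALUE;
        int end = -1;
        int seen = 0;
        for (Bigram b : bigrams(text)) {
            boolean inside = b.start >= at && b.end <= at + phrase.length();
            if (!inside) {
                continue;
            }
            if (seen++ < skipFirstMatches) {
                continue; // drop this matching token from the highlighted span
            }
            start = Math.min(start, b.start);
            end = Math.max(end, b.end);
        }
        if (end < 0) {
            return text;
        }
        return text.substring(0, start) + "<B>" + text.substring(start, end) + "</B>"
                + text.substring(end);
    }

    public static void main(String[] args) {
        String text = "0---------1---------2---------3---------4---------fooo---";
        System.out.println("first matching token index: " + text.indexOf("fooo"));
        System.out.println(highlight(text, "fooo", 0));
        System.out.println(highlight(text, "fooo", 1));
    }
}
```

      With `skipFirstMatches = 0` the sketch yields the expected string, and with `skipFirstMatches = 1` it yields the actual string from the report. The attached patch is the place to look for the actual cause and fix.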
      
      Attachments

      1. HighlighterFix.patch (3 kB, Shinya Kasatani)

        Activity

        Shinya Kasatani added a comment -

        A test case that describes the problem, along with a fix.


          People

          • Assignee: Unassigned
          • Reporter: Shinya Kasatani
          • Votes: 0
          • Watchers: 2
