Lucene - Core / LUCENE-3834

The TokenStream created by SmartChineseAnalyzer can't be reset

    Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 3.5
    • Fix Version/s: None
    • Component/s: modules/analysis
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      The cause is that the input field in class SentenceTokenizer isn't reset when the method reset() is called.

      There are two input fields: one in Tokenizer and another in TokenFilter. To reset a token stream created by SmartChineseAnalyzer, both of them need to be reset. The bug is that SentenceTokenizer.reset() does not reset its input field.

      Class: org.apache.lucene.analysis.cn.smart.SentenceTokenizer

      Original code:

      public final class SentenceTokenizer extends Tokenizer {
        ....
        @Override
        public void reset() throws IOException {
          super.reset();
          tokenStart = tokenEnd = 0;
        }
        ...
      }

      This method should be changed as follows:

      public void reset() throws IOException {
        super.reset();
        /* should reset input */
        if (input.markSupported())
          input.reset();
        tokenStart = tokenEnd = 0;
      }
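The proposed change relies on how java.io.Reader implements markSupported() and reset(). The following is a minimal sketch of that behaviour using only the JDK. SketchTokenizer and ReaderResetSketch are hypothetical stand-ins introduced for illustration; they are not Lucene classes and do not reproduce SentenceTokenizer's logic.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;

final class ReaderResetSketch {

    // Hypothetical stand-in for a tokenizer; NOT Lucene's SentenceTokenizer.
    static final class SketchTokenizer {
        private final Reader input;
        private int tokenStart, tokenEnd;

        SketchTokenizer(Reader input) {
            this.input = input;
        }

        // Drains whatever is left in the underlying reader.
        String consume() throws IOException {
            StringBuilder sb = new StringBuilder();
            int c;
            while ((c = input.read()) != -1) {
                sb.append((char) c);
            }
            tokenEnd += sb.length();
            return sb.toString();
        }

        // Mirrors the reset() body proposed in the report (minus super.reset()).
        void reset() throws IOException {
            if (input.markSupported()) {
                input.reset();
            }
            tokenStart = tokenEnd = 0;
        }
    }

    // Consumes the reader, resets, and consumes again.
    static String readTwice(Reader r) throws IOException {
        SketchTokenizer t = new SketchTokenizer(r);
        String first = t.consume();
        t.reset();
        return first + "|" + t.consume();
    }

    // Reports whether the consume/reset/consume cycle fails with an IOException.
    static boolean resetThrows(Reader r) {
        try {
            readTwice(r);
            return false;
        } catch (IOException e) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        // StringReader rewinds to the start when it was never marked.
        System.out.println(readTwice(new StringReader("ab")));            // ab|ab

        // InputStreamReader does not support mark, so the guard skips reset().
        Reader noMark = new InputStreamReader(
                new ByteArrayInputStream("ab".getBytes(StandardCharsets.UTF_8)),
                StandardCharsets.UTF_8);
        System.out.println(readTwice(noMark));                            // ab|

        // BufferedReader supports mark, but reset() without a prior mark() throws.
        System.out.println(resetThrows(new BufferedReader(new StringReader("ab")))); // true
    }
}
```

Under these assumptions the guard rewinds a StringReader, silently does nothing for readers without mark support, and throws for a BufferedReader that was never marked, so whether the change is sufficient depends on which Reader the caller passes in.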

        Activity

        There are no comments yet on this issue.

          People

          • Assignee:
            Unassigned
            Reporter:
            dingjin
          • Votes:
            0
            Watchers:
            1
