Lucene - Core / LUCENE-3834

The TokenStream created by SmartChineseAnalyzer can't be reset

    Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 3.5
    • Fix Version/s: None
    • Component/s: modules/analysis
    • Labels: None
    • Lucene Fields: New

      Description

      This happens because the input field in class SentenceTokenizer is not reset when the method reset() is called.

      There are two input fields: one in Tokenizer and another in TokenFilter. To reset a TokenStream created by SmartChineseAnalyzer, both of them need to be reset. The bug is that the author forgot to reset the input field in class SentenceTokenizer.

      Class: org.apache.lucene.analysis.cn.smart.SentenceTokenizer

      Original code:

      public final class SentenceTokenizer extends Tokenizer {
        ...
        @Override
        public void reset() throws IOException {
          super.reset();
          tokenStart = tokenEnd = 0;
        }
        ...
      }
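      The symptom can be sketched without Lucene on the classpath. The classes below (SketchSentenceTokenizer and ResetDemo) are hypothetical stand-ins written for illustration only; they are not Lucene's SentenceTokenizer and they simplify sentence splitting to "read up to the next '.'". The sketch assumes the input is a StringReader, which rewinds to its beginning on reset() when it has never been marked.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

/** Hypothetical stand-in for a tokenizer that owns a Reader; this is not Lucene's class. */
final class SketchSentenceTokenizer {
  private final Reader input;
  private final boolean rewindInput; // true applies the change proposed in this issue
  private int tokenStart, tokenEnd;

  SketchSentenceTokenizer(Reader input, boolean rewindInput) {
    this.input = input;
    this.rewindInput = rewindInput;
  }

  /** Returns the next sentence (text up to and including '.'), or null once input is exhausted. */
  String next() throws IOException {
    StringBuilder sentence = new StringBuilder();
    tokenStart = tokenEnd;
    int ch;
    while ((ch = input.read()) != -1) {
      sentence.append((char) ch);
      tokenEnd++;
      if (ch == '.') {
        break;
      }
    }
    return sentence.length() == 0 ? null : sentence.toString();
  }

  void reset() throws IOException {
    if (rewindInput && input.markSupported()) {
      input.reset(); // a never-marked StringReader rewinds to its start
    }
    tokenStart = tokenEnd = 0; // offsets are cleared in both variants
  }
}

final class ResetDemo {
  /** Counts sentences before and after reset(): {firstPass, secondPass}. */
  static int[] sentencesPerPass(boolean rewindInput) {
    try {
      SketchSentenceTokenizer t = new SketchSentenceTokenizer(
          new StringReader("First sentence. Second sentence."), rewindInput);
      int first = drain(t);
      t.reset();
      return new int[] {first, drain(t)};
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  private static int drain(SketchSentenceTokenizer t) throws IOException {
    int n = 0;
    while (t.next() != null) {
      n++;
    }
    return n;
  }

  public static void main(String[] args) {
    int[] original = sentencesPerPass(false);
    int[] proposed = sentencesPerPass(true);
    System.out.println(original[0] + " then " + original[1]); // prints "2 then 0"
    System.out.println(proposed[0] + " then " + proposed[1]); // prints "2 then 2"
  }
}
```

      In this sketch, resetting only the offsets leaves the Reader exhausted, so the second pass yields nothing; rewinding the Reader as well makes the second pass match the first.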

      This method should be changed as follows:

      public void reset() throws IOException {
        super.reset();
        /* should reset input */
        if (input.markSupported()) {
          input.reset();
        }
        tokenStart = tokenEnd = 0;
      }
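      How the guarded call behaves depends on which java.io.Reader is passed in. The sketch below uses only the JDK; ReaderResetSketch and guardedReset are hypothetical names chosen for illustration. It drains a reader, applies the same markSupported()/reset() guard as the proposal, and reports the outcome for three common reader types.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;

final class ReaderResetSketch {
  /** Drains the reader, applies the guarded reset, and describes what happened. */
  static String guardedReset(Reader input) {
    try {
      while (input.read() != -1) {
        // consume everything, as a first tokenization pass would
      }
      if (!input.markSupported()) {
        return "skipped"; // guard prevents the call; reader stays exhausted
      }
      input.reset();
      return input.read() == -1 ? "still exhausted" : "rewound";
    } catch (IOException e) {
      return "threw IOException";
    }
  }

  public static void main(String[] args) {
    byte[] bytes = "abc".getBytes(StandardCharsets.UTF_8);
    // StringReader: reset() without a mark returns to the start of the string.
    System.out.println(guardedReset(new StringReader("abc")));
    // InputStreamReader: markSupported() is false, so the guard skips reset().
    System.out.println(guardedReset(
        new InputStreamReader(new ByteArrayInputStream(bytes), StandardCharsets.UTF_8)));
    // BufferedReader: markSupported() is true, but reset() fails if mark() was never called.
    System.out.println(guardedReset(new BufferedReader(new StringReader("abc"))));
  }
}
```

      Under these assumptions the guard helps for a StringReader, has no effect for readers without mark support, and can still raise an IOException for a BufferedReader that was never marked, so callers of the proposed reset() may need to account for that last case.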

        Activity

        dingjin made changes -
        Field: Description
          Original Value: same text as the current description, with the proposed fix calling input.reset() unconditionally
          New Value: the current description, with the call guarded by if (input.markSupported())
        Field: Priority
          Original Value: Major [ 3 ]
          New Value: Minor [ 4 ]

        dingjin made changes -
        Field: Description
          Original Value: same text, with the comment in the proposed fix written as //should reset input
          New Value: same text, with the comment written as /*should reset input*/


        dingjin created issue -

          People

          • Assignee: Unassigned
          • Reporter: dingjin
          • Votes: 0
          • Watchers: 1
