Uploaded image for project: 'Lucene - Core'
  1. Lucene - Core
  2. LUCENE-2219

improve BaseTokenStreamTestCase to test end()

Attach filesAttach ScreenshotVotersWatch issueWatchersCreate sub-taskLinkCloneUpdate Comment AuthorReplace String in CommentUpdate Comment VisibilityDelete Comments
    XMLWordPrintableJSON

Details

    • New, Patch Available

    Description

      If offsetAtt/end() is not implemented correctly, then there can be problems with highlighting: see LUCENE-2207 for an example with CJKTokenizer.

      In my opinion you currently have to write too much code to test this.

      This patch does the following:

      • adds optional Integer finalOffset (can be null for no checking) to assertTokenStreamContents
      • in assertAnalyzesTo, automatically fill this with the String length()

      In my opinion this is correct, for assertTokenStreamContents the behavior should be optional, it may not even have a Tokenizer. If you are using assertTokenStreamContents with a Tokenizer then simply provide the extra expected value to check it.

      for assertAnalyzesTo then it is implied there is a tokenizer so it should be checked.

      the tests pass for core but there are failures in contrib even besides CJKTokenizer (apply Koji's patch from LUCENE-2207, it is correct). Specifically ChineseTokenizer has a similar problem.

      Attachments

        1. LUCENE-2219.patch
          20 kB
          Robert Muir
        2. LUCENE-2219.patch
          14 kB
          Robert Muir
        3. LUCENE-2219.patch
          4 kB
          Robert Muir

        Issue Links

        Activity

          This comment will be Viewable by All Users Viewable by All Users
          Cancel

          People

            rcmuir Robert Muir
            rcmuir Robert Muir
            Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Slack

                Issue deployment