  1. Lucene - Core
  2. LUCENE-5278

MockTokenizer throws away the character right after a token even if it is a valid start to a new token

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Trivial
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.6, 6.0
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      MockTokenizer throws away the character right after a token even if that character is a valid start to a new token. You won't see this unless you build a tokenizer whose pattern matches every character, for example with new RegExp(".") or new RegExp("...").

      Changing this behaviour seems to break a number of tests.
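      The behaviour described above can be illustrated with a small, self-contained sketch. This is a simplified model written for this report, not the actual MockTokenizer code: the class name PushbackSketch, the method tokenize, and the pushBack flag are all hypothetical. It assumes a pattern that accepts any character up to a fixed token length (standing in for RegExp(".") or RegExp("...")) and that a trailing partial token is still emitted.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified, hypothetical model of the reported behaviour.
// It is NOT the real MockTokenizer implementation.
class PushbackSketch {

    // Splits input into tokens of at most tokenLength characters, treating
    // every character as a valid token character (tokenLength = 1 stands in
    // for RegExp("."), tokenLength = 3 for RegExp("...")).
    // pushBack = false models the reported bug; pushBack = true models a fix
    // that re-reads the character that ended the previous token.
    static List<String> tokenize(String input, int tokenLength, boolean pushBack) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int pos = 0;
        while (pos < input.length()) {
            char c = input.charAt(pos++);
            if (current.length() < tokenLength) {
                current.append(c);
            } else {
                // The pattern cannot consume c, so the current token ends here.
                tokens.add(current.toString());
                current.setLength(0);
                if (pushBack) {
                    pos--; // re-read c as the first character of the next token
                }
                // Without pushBack, c is silently dropped.
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString()); // assumption: emit a trailing partial token
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("abcde", 1, false)); // prints [a, c, e]
        System.out.println(tokenize("abcde", 1, true));  // prints [a, b, c, d, e]
    }
}
```

      With pushBack disabled, every character that terminates a token is lost, which only becomes visible when that character could itself have started a token.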

        Attachments

        1. LUCENE-5278.patch
          9 kB
          Robert Muir
        2. LUCENE-5278.patch
          6 kB
          Robert Muir
        3. LUCENE-5278.patch
          5 kB
          Nik Everett

            People

            • Assignee:
              rcmuir Robert Muir
              Reporter:
              nik9000@gmail.com Nik Everett
            • Votes:
              0
              Watchers:
              3
