LUCENE-4993: BeiderMorseFilter inserts tokens with positionIncrement=0, but ignores all custom attributes except OffsetAttribute

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 4.3
    • Fix Version/s: 4.3.1, 6.0
    • Component/s: modules/analysis
    • Labels: None
    • Lucene Fields: New, Patch Available

      Description

      BeiderMorseFilter sometimes inserts additional phonetic tokens for the same source token. Currently it calls clearAttributes() before doing so and then only sets the new token's term, positionIncrement=0, and the original offset.

      This leads to problems if the TokenStream contains other attributes set earlier in the chain (such as KeywordAttribute, FlagsAttribute, ...): they are all reset to their defaults for the inserted tokens.

      The TokenFilter should drop the special-case code for preserving offsets and instead use captureState() and restoreState().
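
      A minimal sketch of the captureState()/restoreState() pattern proposed above (this is an illustration, not the actual BeiderMorseFilter code; the upper-cased "variant" merely stands in for the additional phonetic encodings, and InjectVariantFilter is a hypothetical name):

      import java.io.IOException;
      import java.util.Locale;
      import org.apache.lucene.analysis.TokenFilter;
      import org.apache.lucene.analysis.TokenStream;
      import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
      import org.apache.lucene.analysis.tokenattributes.PositionIncrementAttribute;
      import org.apache.lucene.util.AttributeSource;

      /**
       * Illustrative filter: for every token it injects one extra token at the
       * same position, restoring the captured state of the original token so
       * that all attributes (offsets, KeywordAttribute, FlagsAttribute,
       * payloads, ...) are preserved; only term and positionIncrement change.
       */
      final class InjectVariantFilter extends TokenFilter {
        private final CharTermAttribute termAtt = addAttribute(CharTermAttribute.class);
        private final PositionIncrementAttribute posIncAtt =
            addAttribute(PositionIncrementAttribute.class);

        private AttributeSource.State savedState; // non-null while a variant is pending
        private String pendingVariant;

        InjectVariantFilter(TokenStream input) {
          super(input);
        }

        @Override
        public boolean incrementToken() throws IOException {
          if (savedState != null) {
            // Restore the full attribute state of the original token, then
            // overwrite only the term and set positionIncrement to 0.
            restoreState(savedState);
            termAtt.setEmpty().append(pendingVariant);
            posIncAtt.setPositionIncrement(0);
            savedState = null;
            return true;
          }
          if (!input.incrementToken()) {
            return false;
          }
          // Stand-in for the phonetic encoder: the "variant" here is just the
          // upper-cased term; remember the current state for the injected token.
          pendingVariant = termAtt.toString().toUpperCase(Locale.ROOT);
          savedState = captureState();
          return true;
        }

        @Override
        public void reset() throws IOException {
          super.reset();
          savedState = null;
          pendingVariant = null;
        }
      }

      With this pattern the injected token inherits every attribute of the original token, including any KeywordAttribute or FlagsAttribute set by earlier filters, so no explicit offset-copying special case is needed.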

    People

    • Assignee: Uwe Schindler (thetaphi)
    • Reporter: Uwe Schindler (thetaphi)
    • Votes: 0
    • Watchers: 2
