  1. Lucene - Core
  2. LUCENE-8730

Ensure WordDelimiterGraphFilter always emits its original token first


    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 8.1
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      WordDelimiterFilter (WDF) and WordDelimiterGraphFilter (WDGF) behave almost identically, apart from WDGF setting position length. The only other difference is that WDGF can sometimes emit its original token as the second output token rather than the first. We should change this to conform to the behaviour of the older filter. This will make it much easier to remove WDF entirely and to cut over tests that use it incidentally.
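
      The ordering property described above can be illustrated with a small, Lucene-free sketch. Everything in it is an assumption made for illustration: the class OriginalFirstSketch, the helpers originalIsFirst and moveOriginalFirst, and the sample terms are hypothetical and are not taken from the attached patch or from the filter's real output. It only shows what "original token first" means for a list of terms emitted for one input word.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class OriginalFirstSketch {

    // Hypothetical check: true when the first emitted term is the unmodified input term.
    static boolean originalIsFirst(String original, List<String> emitted) {
        return !emitted.isEmpty() && emitted.get(0).equals(original);
    }

    // Hypothetical reordering: returns a copy in which the first occurrence of the
    // original term is moved to the front; the relative order of the rest is kept.
    static List<String> moveOriginalFirst(String original, List<String> emitted) {
        List<String> reordered = new ArrayList<>(emitted);
        if (reordered.remove(original)) {
            reordered.add(0, original);
        }
        return reordered;
    }

    public static void main(String[] args) {
        // Illustrative terms only, not real WordDelimiterGraphFilter output.
        List<String> emitted = Arrays.asList("Power", "PowerShot", "Shot");

        System.out.println(originalIsFirst("PowerShot", emitted));
        System.out.println(moveOriginalFirst("PowerShot", emitted));
    }
}
```

      A check of this shape is the kind of assertion that a test shared between the two filters could rely on once both emit the original token first.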

        Attachments

        1. LUCENE-8730.patch
          5 kB
          Alan Woodward
        2. LUCENE-8730.patch
          4 kB
          Alan Woodward

              People

              • Assignee:
                romseygeek Alan Woodward
                Reporter:
                romseygeek Alan Woodward
              • Votes:
                0
                Watchers:
                3
