Lucene - Core / LUCENE-4863

Use FST to hold term in StemmerOverrideFilter

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 4.2
    • Fix Version/s: 4.3, 5.0
    • Component/s: modules/analysis
    • Labels:
      None
    • Lucene Fields:
      New, Patch Available

      Description

      Follow-up from LUCENE-4857.

      1. LUCENE-4863.patch
        22 kB
        Simon Willnauer
      2. LUCENE-4863.patch
        20 kB
        Simon Willnauer
      3. LUCENE-4863.patch
        20 kB
        Simon Willnauer

          Activity

          Simon Willnauer added a comment -

          Here is a patch.

          Simon Willnauer added a comment -

          Slightly updated patch with some cleanups.

          Robert Muir added a comment -

          A few nits:

          • This converts to UTF-8, but stores in a BYTE4 automaton. Is there a reason for BYTE4?
          • Javadoc typo: "Adds an input string and it's stemmer overwrite output to this builder."
          • Should ignoreCase be a property of the map itself rather than a separate parameter? SynonymFilter has this same problem. If you didn't previously add to the map properly (e.g. lowercased), then this parameter won't work.
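
The third point can be illustrated with a minimal sketch in plain Java. This is not the Lucene API and not the code from the patch: it uses a `HashMap` instead of an FST, and `OverrideMapSketch`, `OverrideMap`, `Builder`, and `lookupWithSeparateFlag` are hypothetical names chosen for illustration. It only shows why case folding at lookup time fails when the keys were stored as given, and how tying ignoreCase to the builder avoids that.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration only; not the Lucene StemmerOverrideFilter API.
final class OverrideMapSketch {

    /** Lookup table that remembers whether it was built case-insensitively. */
    static final class OverrideMap {
        private final Map<String, String> entries;
        private final boolean ignoreCase;

        private OverrideMap(Map<String, String> entries, boolean ignoreCase) {
            this.entries = entries;
            this.ignoreCase = ignoreCase;
        }

        /** Returns the override for the term, or null if none is registered. */
        String get(String term) {
            return entries.get(ignoreCase ? term.toLowerCase(Locale.ROOT) : term);
        }
    }

    /** Builder that folds case when adding, so keys and lookups always agree. */
    static final class Builder {
        private final Map<String, String> entries = new HashMap<>();
        private final boolean ignoreCase;

        Builder(boolean ignoreCase) {
            this.ignoreCase = ignoreCase;
        }

        Builder add(String input, String output) {
            entries.put(ignoreCase ? input.toLowerCase(Locale.ROOT) : input, output);
            return this;
        }

        OverrideMap build() {
            return new OverrideMap(new HashMap<>(entries), ignoreCase);
        }
    }

    /** The pitfall: the flag is applied only at lookup time, against keys stored as given. */
    static String lookupWithSeparateFlag(Map<String, String> raw, String term, boolean ignoreCase) {
        return raw.get(ignoreCase ? term.toLowerCase(Locale.ROOT) : term);
    }

    public static void main(String[] args) {
        // Key was added with mixed case, so the lowercased lookup misses it.
        Map<String, String> raw = new HashMap<>();
        raw.put("Running", "run");
        System.out.println(lookupWithSeparateFlag(raw, "RUNNING", true)); // prints "null"

        // With ignoreCase owned by the builder, the key is folded on add as well.
        OverrideMap map = new Builder(true).add("Running", "run").build();
        System.out.println(map.get("RUNNING")); // prints "run"
    }
}
```

With the flag owned by the builder, a caller cannot construct a map whose keys disagree with the case handling used at lookup time.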
          Robert Muir added a comment -

          Oops, I see the UTF-8 is for the output. This is good; never mind the first comment.

          Simon Willnauer added a comment -

          Updated patch, fixing the typo and moving ignoreCase into the map implementation. I will commit this soon. Thanks for looking at it, Robert!

          Simon Willnauer added a comment -

          Committed to 4.x (rev. 1460602) and trunk (rev. 1460580).

          Uwe Schindler added a comment -

          Closed after release.


            People

            • Assignee: Simon Willnauer
            • Reporter: Simon Willnauer
            • Votes: 0
            • Watchers: 1
