Lucene - Core / LUCENE-4857

StemmerOverrideFilter should not copy the stem override dictionary in its constructor.

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 4.0, 4.1, 4.2
    • Fix Version/s: 4.2.1, 6.0
    • Component/s: modules/analysis
    • Labels:
      None
    • Lucene Fields:
      New, Patch Available

      Description

      Currently the dictionary is cloned each time the token filter is created. This is a serious bottleneck if you use this filter with large dictionaries, and it can also lead to OOMs if lots of those filters sit in ThreadLocals and new threads keep being added. I think cloning the map should be done in the analyzer (which all of our analyzers already do, by the way; this is the only TokenFilter that copies on its own). There is no real need for the filter to copy that map.
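
The difference between the two ownership models can be sketched in plain Java. This is only an illustration under stated assumptions, not the patch attached to this issue: `CopyingOverrideFilter`, `SharingOverrideFilter`, and `OverrideAnalyzerSketch` are hypothetical stand-ins that do not extend any Lucene class, and a `java.util.Map<String, String>` stands in for the real dictionary type.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins, NOT Lucene classes: they only model who owns the dictionary.

/** Behavior described in the issue: every filter instance clones the map. */
final class CopyingOverrideFilter {
    final Map<String, String> dictionary;

    CopyingOverrideFilter(Map<String, String> dictionary) {
        // Costs time and memory proportional to the dictionary size, per filter instance.
        this.dictionary = new HashMap<>(dictionary);
    }

    String stem(String term) {
        return dictionary.getOrDefault(term, term);
    }
}

/** Proposed behavior: the filter keeps a reference and the caller owns the map. */
final class SharingOverrideFilter {
    final Map<String, String> dictionary;

    SharingOverrideFilter(Map<String, String> dictionary) {
        this.dictionary = dictionary; // no copy
    }

    String stem(String term) {
        return dictionary.getOrDefault(term, term);
    }
}

/** Stand-in for an analyzer: copies the dictionary once, then shares it with every filter. */
final class OverrideAnalyzerSketch {
    private final Map<String, String> dictionary;

    OverrideAnalyzerSketch(Map<String, String> userDictionary) {
        // One defensive copy, wrapped so that filters cannot mutate it.
        this.dictionary = Collections.unmodifiableMap(new HashMap<>(userDictionary));
    }

    SharingOverrideFilter newFilter() {
        return new SharingOverrideFilter(dictionary);
    }
}
```

With this arrangement, each call to `newFilter()` is constant-time with respect to the dictionary size, so filters cached per thread no longer multiply the dictionary's memory footprint.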

      1. LUCENE-4857.patch
        2 kB
        Simon Willnauer

          Activity

          Simon Willnauer added a comment -

          Here is a patch.

          Robert Muir added a comment -

          +1

          I think we should do this for 4.2.1, but change this filter to use an FST for 4.3.

          That way, if someone has a big dictionary, it won't eat up tons of RAM, and it also enforces immutability.

          It means the factory must do a little more work, but I think that's OK.

          Commit Tag Bot added a comment -

          [trunk commit] Simon Willnauer
          http://svn.apache.org/viewvc?view=revision&revision=1458848

          LUCENE-4857: Don't unnecessarily copy stem override map in StemmerOverrideFilter

          Commit Tag Bot added a comment -

          [trunk commit] Simon Willnauer
          http://svn.apache.org/viewvc?view=revision&revision=1458857

          LUCENE-4857: Don't unnecessarily copy stem override map in StemmerOverrideFilter

          Commit Tag Bot added a comment -

          [branch_4x commit] Simon Willnauer
          http://svn.apache.org/viewvc?view=revision&revision=1458863

          LUCENE-4857: Don't unnecessarily copy stem override map in StemmerOverrideFilter


            People

            • Assignee: Simon Willnauer
            • Reporter: Simon Willnauer
            • Votes: 0
            • Watchers: 1
