Lucene - Core
LUCENE-4279

Regenerate Snowball code so it's not so heavy

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.0-BETA, 6.0
    • Component/s: None
    • Labels: None
    • Lucene Fields: New

      Description

      Spinoff from LUCENE-3841 (and several threads on the list)

      Currently each SnowballStemmer is pretty heavy, since each instance also contains a bunch of Among objects (part of the stemming rules).

      This normally shouldn't be a problem, except it seems challenging
      for Tomcat users to tune their thread pools (basically they are creating
      lots of tokenstreams, so lots of SnowballStemmers).

      Newer Snowball just makes these static, and it's easy enough to just
      regenerate so these aren't so heavy. It doesn't fix the real problem, but it also doesn't hurt.
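
      To illustrate the difference between per-instance and static rule tables, here is a minimal sketch. It is not the generated Snowball code: Rule, PerInstanceStemmer, SharedTableStemmer and the three suffix rules are hypothetical stand-ins for the real Among objects and stemmer classes.

```java
public class AmongSharingSketch {

    /** Hypothetical stand-in for a Snowball rule entry (not the real Among class). */
    static final class Rule {
        final String suffix;
        final String replacement;

        Rule(String suffix, String replacement) {
            this.suffix = suffix;
            this.replacement = replacement;
        }
    }

    /** Heavy variant: every stemmer instance allocates its own copy of the rule table. */
    static final class PerInstanceStemmer {
        private final Rule[] rules = {
            new Rule("sses", "ss"), new Rule("ies", "i"), new Rule("s", "")
        };

        Rule[] rules() { return rules; }

        String stem(String word) { return apply(rules, word); }
    }

    /** Light variant: one static, read-only rule table shared by all instances. */
    static final class SharedTableStemmer {
        private static final Rule[] RULES = {
            new Rule("sses", "ss"), new Rule("ies", "i"), new Rule("s", "")
        };

        Rule[] rules() { return RULES; }

        String stem(String word) { return apply(RULES, word); }
    }

    /** Applies the first rule whose suffix matches; returns the word unchanged otherwise. */
    static String apply(Rule[] rules, String word) {
        for (Rule r : rules) {
            if (word.endsWith(r.suffix)) {
                return word.substring(0, word.length() - r.suffix.length()) + r.replacement;
            }
        }
        return word;
    }

    public static void main(String[] args) {
        PerInstanceStemmer a = new PerInstanceStemmer();
        PerInstanceStemmer b = new PerInstanceStemmer();
        SharedTableStemmer c = new SharedTableStemmer();
        SharedTableStemmer d = new SharedTableStemmer();

        // Each per-instance stemmer holds its own table; the static variant shares one.
        System.out.println("per-instance tables distinct: " + (a.rules() != b.rules()));
        System.out.println("static table shared: " + (c.rules() == d.rules()));
        // Both variants produce the same stems; only the allocation pattern differs.
        System.out.println("same output: " + a.stem("ponies").equals(c.stem("ponies")));
    }
}
```

      With many token streams alive at once, the per-instance variant multiplies the table allocation by the number of stemmers, while the static variant pays for it once per class.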

      1. LUCENE-4279.patch
        1.58 MB
        Robert Muir

        Activity

        Robert Muir added a comment -

        patch: no need to regenerate the ones from the website that aren't in the package, as they already work this way (Irish/Basque/Catalan/Armenian)

        I also added a thread safety test (just checkRandomData against all the languages).
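
        The idea behind such a thread safety check can be sketched roughly as follows. This is not the test from the patch and does not use Lucene's checkRandomData: the RULES table, stem and countMismatches are hypothetical names, and the sketch only compares multi-threaded results against a single-threaded baseline for a fixed set of words.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SharedRulesThreadCheck {

    // Hypothetical suffix table, shared by all threads and never modified after class init.
    private static final String[][] RULES = { {"sses", "ss"}, {"ies", "i"}, {"s", ""} };

    /** Applies the first matching suffix rule; keeps no mutable shared state. */
    static String stem(String word) {
        for (String[] r : RULES) {
            if (word.endsWith(r[0])) {
                return word.substring(0, word.length() - r[0].length()) + r[1];
            }
        }
        return word;
    }

    /** Stems the same inputs on several threads and counts results that differ from a single-threaded baseline. */
    static int countMismatches(int threads, int iterations) {
        String[] inputs = {"caresses", "ponies", "cats", "stem"};
        String[] expected = new String[inputs.length];
        for (int i = 0; i < inputs.length; i++) {
            expected[i] = stem(inputs[i]);
        }

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                results.add(pool.submit(() -> {
                    int bad = 0;
                    for (int n = 0; n < iterations; n++) {
                        for (int i = 0; i < inputs.length; i++) {
                            if (!expected[i].equals(stem(inputs[i]))) {
                                bad++;
                            }
                        }
                    }
                    return bad;
                }));
            }
            int total = 0;
            for (Future<Integer> f : results) {
                total += f.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("worker failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("mismatches: " + countMismatches(4, 1000));
    }
}
```

        Sharing a static table is safe here only because it is read-only after initialization; any per-word working state has to stay local to each stemmer or thread.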


          People

          • Assignee: Unassigned
          • Reporter: Robert Muir
          • Votes: 0
          • Watchers: 1
