  1. Lucene - Core
  2. LUCENE-2564

WordListLoader is inefficient

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.5, 4.0-ALPHA
    • Component/s: modules/analysis
    • Labels: None
    • Lucene Fields: New, Patch Available

    Description

      WordListLoader is basically used for loading stopword lists, stem dictionaries, etc.
      Unfortunately the API returns Set<String>, and sometimes even HashSet<String> or HashMap<String,String>.

      I think we should break the API and return CharArraySet and CharArrayMap instances (but leave the declared return types as the generic Set and Map).

      If someone objects to breaking it in 3.1, then we can do this only in 4.0, but I think it would be good to fix it in both places.
      The reason is that if someone calls new FooAnalyzer() a lot (probably not uncommon), I think it's doing a bunch of useless copying.

      I think we should slap @lucene.internal on this API too, since that's mostly how it's being used.
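
The copying concern in the description can be illustrated with a rough Java sketch. It deliberately does not use any Lucene classes: `CharKeySet` is a hypothetical stand-in for a specialized set such as CharArraySet, and `loadAsHashSet`, `loadAsCharKeySet`, `asCharKeySet` and the `copies` counter are names made up for this example. The assumption is that an analyzer converts whatever Set it receives into the specialized type, so a loader that already returns that type (behind the generic Set interface) avoids a copy per analyzer construction.

```java
import java.util.AbstractSet;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

public class WordListSketch {

    /** Hypothetical stand-in for a specialized set type such as CharArraySet. */
    public static final class CharKeySet extends AbstractSet<String> {
        private final Set<String> words = new HashSet<>();

        public CharKeySet(Iterable<String> source) {
            for (String w : source) {
                words.add(w);
            }
        }

        @Override public Iterator<String> iterator() { return words.iterator(); }
        @Override public int size() { return words.size(); }
        @Override public boolean contains(Object o) { return words.contains(o); }
    }

    /** Counts how many times a word list had to be copied into a CharKeySet. */
    public static int copies = 0;

    /** Loader in the old style: returns a plain HashSet. */
    public static Set<String> loadAsHashSet(List<String> lines) {
        return new HashSet<>(lines);
    }

    /** Loader in the proposed style: specialized set, generic return type. */
    public static Set<String> loadAsCharKeySet(List<String> lines) {
        return new CharKeySet(lines);
    }

    /** Analyzer-side conversion (assumed): copies only if the type differs. */
    public static CharKeySet asCharKeySet(Set<String> words) {
        if (words instanceof CharKeySet) {
            return (CharKeySet) words;
        }
        copies++;
        return new CharKeySet(words);
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("a", "an", "the");

        // Simulate constructing an analyzer three times from a HashSet.
        Set<String> plain = loadAsHashSet(lines);
        for (int i = 0; i < 3; i++) {
            asCharKeySet(plain);
        }
        System.out.println("copies with HashSet input: " + copies); // 3

        // Same, but the loader already returned the specialized type.
        copies = 0;
        Set<String> special = loadAsCharKeySet(lines);
        for (int i = 0; i < 3; i++) {
            asCharKeySet(special);
        }
        System.out.println("copies with CharKeySet input: " + copies); // 0
    }
}
```

Whether the real analyzers skip the copy in exactly this way is not stated in this issue; the sketch only shows why returning the specialized type from the loader can make repeated analyzer construction cheaper.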

      Attachments

        1. LUCENE-2564.patch
          51 kB
          Simon Willnauer
        2. LUCENE-2564.patch
          53 kB
          Simon Willnauer
        3. LUCENE-2564.patch
          54 kB
          Simon Willnauer

          People

            Assignee: Robert Muir (rcmuir)
            Reporter: Robert Muir (rcmuir)