Solr / SOLR-1850

KeepWordFilter can be slow at query time if wordlist is large

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.4
    • Fix Version/s: 3.1, 4.0-ALPHA
    • Component/s: Schema and Analysis
    • Labels: None

    Description

      When "Set<String> words" is large, constructing a KeepWordFilter at query time is very costly because the constructor copies the set, e.g.:

      this.words = new CharArraySet(words, ignoreCase);

      This call does an addAll on the set and is repeated for every query, even though the work is identical each time.

      Suggestion: overload the constructor and expose the CharArraySet, e.g.:

      public KeepWordFilter(TokenStream in, CharArraySet words) {
        super(in);
        this.words = words;
        this.termAtt = (TermAttribute) addAttribute(TermAttribute.class);
      }

      This allows the CharArraySet to be constructed once, statically, for the application instead of at query time.
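
      To illustrate the difference between the two constructors, here is a minimal, self-contained sketch. It does not use the Lucene classes: WordSet and KeepFilterSketch are hypothetical stand-ins for CharArraySet and KeepWordFilter, built on java.util only, and the builds counter exists purely to make the repeated copying visible.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Hypothetical stand-in for CharArraySet: building it copies every word. */
final class WordSet {
    /** Number of times a WordSet has been built (illustration only). */
    static int builds = 0;

    private final Set<String> entries = new HashSet<>();
    private final boolean ignoreCase;

    WordSet(Collection<String> words, boolean ignoreCase) {
        builds++;
        this.ignoreCase = ignoreCase;
        for (String w : words) {
            entries.add(normalize(w)); // the expensive part for a large word list
        }
    }

    boolean contains(String token) {
        return entries.contains(normalize(token));
    }

    private String normalize(String s) {
        return ignoreCase ? s.toLowerCase(Locale.ROOT) : s;
    }
}

/** Hypothetical stand-in for KeepWordFilter with both constructor styles. */
final class KeepFilterSketch {
    private final WordSet words;

    /** Existing style: copies the word list on every construction. */
    KeepFilterSketch(Set<String> words, boolean ignoreCase) {
        this.words = new WordSet(words, ignoreCase);
    }

    /** Proposed overload: reuses a set that was built once up front. */
    KeepFilterSketch(WordSet words) {
        this.words = words;
    }

    /** Keeps only the tokens that appear in the word set. */
    List<String> filter(List<String> tokens) {
        List<String> kept = new ArrayList<>();
        for (String t : tokens) {
            if (words.contains(t)) {
                kept.add(t);
            }
        }
        return kept;
    }
}
```

      With the first constructor, every new filter increments WordSet.builds. With the overload, an application could build one WordSet at startup (for example in a static final field) and pass it to each filter created at query time, so the copy happens once.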

          People

            Assignee: Unassigned
            Reporter: John Wang (john.wang@gmail.com)
