SOLR-1850

KeepWordFilter can be slow at query time if wordlist is large

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.4
    • Fix Version/s: 3.1, 4.0-ALPHA
    • Component/s: Schema and Analysis
    • Labels: None

      Description

      When the "Set<String> words" argument is large, constructing a KeepWordFilter at query time is very costly because the constructor copies the set, e.g.:

      this.words = new CharArraySet(words, ignoreCase);

      This call performs an addAll on the new set. It runs for every query and repeats the same work each time.

      Suggestion: overload the constructor and expose the CharArraySet, e.g.:

      public KeepWordFilter(TokenStream in, CharArraySet words) {
        super(in);
        this.words = words;
        this.termAtt = (TermAttribute) addAttribute(TermAttribute.class);
      }

      This allows the CharArraySet to be constructed once, statically, for the application instead of at query time.
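
      The build-once, share-by-reference idea can be sketched without Lucene on the classpath. The following is only an illustration under stated assumptions: KeepWordsSketch, buildKeepSet, KeepFilter and KEEP are hypothetical names, and a plain java.util.Set stands in for CharArraySet and a List<String> for the TokenStream. It is not the actual Lucene or Solr code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Stand-in sketch: java.util types replace Lucene's CharArraySet and TokenStream.
class KeepWordsSketch {

    // Builds the lookup set once. This is the O(n) copy that the issue
    // wants to move out of the per-query path.
    static Set<String> buildKeepSet(Collection<String> words, boolean ignoreCase) {
        Set<String> set = new HashSet<>(words.size() * 2);
        for (String w : words) {
            set.add(ignoreCase ? w.toLowerCase(Locale.ROOT) : w);
        }
        return Collections.unmodifiableSet(set);
    }

    // Hypothetical filter: keeps a reference to the prebuilt set,
    // so constructing one per query costs O(1) instead of O(n).
    static final class KeepFilter {
        private final Set<String> keep;
        private final boolean ignoreCase;

        KeepFilter(Set<String> prebuilt, boolean ignoreCase) {
            this.keep = prebuilt; // shared reference, no copy
            this.ignoreCase = ignoreCase;
        }

        // Returns only the tokens that appear in the keep set.
        List<String> filter(List<String> tokens) {
            List<String> out = new ArrayList<>();
            for (String t : tokens) {
                String key = ignoreCase ? t.toLowerCase(Locale.ROOT) : t;
                if (keep.contains(key)) {
                    out.add(t);
                }
            }
            return out;
        }
    }

    // Built once for the whole application and never modified afterwards.
    static final Set<String> KEEP = buildKeepSet(Arrays.asList("Solr", "Lucene"), true);

    public static void main(String[] args) {
        // A new filter per query reuses the same set.
        KeepFilter perQuery = new KeepFilter(KEEP, true);
        System.out.println(perQuery.filter(Arrays.asList("solr", "is", "built", "on", "LUCENE")));
        // prints [solr, LUCENE]
    }
}
```

      Because the shared set is wrapped as unmodifiable and only read after construction, many per-query filters can hold the same reference without copying it.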

            People

            • Assignee: Unassigned
            • Reporter: John Wang (john.wang@gmail.com)