
Key: LUCENE-494
Type: New Feature
Status: Closed
Resolution: Fixed
Priority: Minor
Assignee: Grant Ingersoll
Reporter: Mark Harwood
Votes: 1
Watchers: 0
Lucene - Java

Analyzer for preventing overload of search service by queries with common terms in large indexes

Created: 08/Feb/06 08:12 AM   Updated: 11/Oct/08 12:49 PM
Component/s: Analysis
Affects Version/s: 2.4
Fix Version/s: 2.4

Time Tracking: Not Specified

File Attachments:
QueryAutoStopWordAnalyzer.java (Java source file, 8 kB), attached 2006-02-08 08:13 AM by Mark Harwood, licensed for inclusion in ASF works
QueryAutoStopWordAnalyzerTest.java (Java source file, 6 kB), attached 2006-02-08 08:13 AM by Mark Harwood, licensed for inclusion in ASF works

Resolution Date: 07/Feb/08 02:13 PM


Description
An analyzer, used primarily at query time, that wraps another analyzer and adds a layer of protection
by preventing very common words from being passed into queries. For very large indexes, the cost
of reading TermDocs for a very common word can be high. This analyzer was created after experience with
a 38-million-document index in which one term appeared in around 50% of documents, causing TermQueries
for that term to take 2 seconds.
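
The wrapping idea can be illustrated without Lucene. The sketch below is only an assumption-laden stand-in: the class StopWordWrappingSketch, its wrap method, and the toy simpleTokenize delegate are hypothetical names invented for this example, and plain functions replace Lucene's Analyzer and TokenStream types. It shows a delegate tokenizer being wrapped so that any token in a stop set never reaches the query.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.function.Function;

// Hypothetical stand-in for the wrapping idea; this is NOT the Lucene Analyzer API.
final class StopWordWrappingSketch {

    // Wraps a delegate tokenizer and drops every token contained in stopWords.
    static Function<String, List<String>> wrap(
            Function<String, List<String>> delegate, Set<String> stopWords) {
        return text -> {
            List<String> kept = new ArrayList<>();
            for (String token : delegate.apply(text)) {
                if (!stopWords.contains(token)) {
                    kept.add(token); // only uncommon terms reach the query
                }
            }
            return kept;
        };
    }

    // Toy delegate (an assumption): lower-cases and splits on whitespace.
    static List<String> simpleTokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }
}
```

With a stop set containing "report", wrapping simpleTokenize and applying it to "Annual Report 2006" would keep only "annual" and "2006", so no TermQuery is ever built for the expensive term.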

Use the various "addStopWords" methods in this class to automatically identify stop words in an
existing index and add them to the analyzer.
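
How such automatic identification might work can be sketched in plain Java. This is not the attached class: the signatures of the real "addStopWords" methods are not shown in this issue, so the name findStopWords, the maxDocFraction parameter, and the use of in-memory term lists in place of an index are all assumptions made for illustration. The sketch counts, for each term, how many documents contain it and flags terms that exceed a chosen fraction of the collection.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of document-frequency-based stop word detection.
final class AutoStopWordSketch {

    // Returns the terms that occur in more than maxDocFraction of the documents.
    // Each document is modelled as a collection of its terms (an assumption;
    // a real index would supply document frequencies directly).
    static Set<String> findStopWords(
            List<? extends Collection<String>> docs, double maxDocFraction) {
        Map<String, Integer> docFreq = new HashMap<>();
        for (Collection<String> doc : docs) {
            // De-duplicate so each document counts a term at most once.
            for (String term : new HashSet<>(doc)) {
                docFreq.merge(term, 1, Integer::sum);
            }
        }
        double limit = maxDocFraction * docs.size();
        Set<String> stopWords = new TreeSet<>();
        for (Map.Entry<String, Integer> entry : docFreq.entrySet()) {
            if (entry.getValue() > limit) {
                stopWords.add(entry.getKey());
            }
        }
        return stopWords;
    }
}
```

The fraction threshold is a design choice for this sketch: expressing the cutoff relative to collection size keeps the same setting meaningful as the index grows.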



Mark Harwood made changes - 08/Feb/06 08:13 AM
  Attachment: (none) -> QueryAutoStopWordAnalyzer.java [ 12322726 ]
Mark Harwood made changes - 08/Feb/06 08:13 AM
  Attachment: (none) -> QueryAutoStopWordAnalyzerTest.java [ 12322727 ]
Grant Ingersoll made changes - 10/Jan/08 08:32 PM
  Assignee: (none) -> Grant Ingersoll [ gsingers ]
Grant Ingersoll made changes - 12/Jan/08 11:11 PM
  Affects Version/s: (none) -> 2.4 [ 12312681 ]
Grant Ingersoll made changes - 07/Feb/08 02:13 PM
  Status: Open [ 1 ] -> Resolved [ 5 ]
  Resolution: (none) -> Fixed [ 1 ]
  Fix Version/s: (none) -> 2.4 [ 12312681 ]
Michael McCandless made changes - 11/Oct/08 12:49 PM
  Status: Resolved [ 5 ] -> Closed [ 6 ]