Issue Details

Key: LUCENE-400
Type: Improvement
Status: Closed
Resolution: Fixed
Priority: Minor
Assignee: Grant Ingersoll
Reporter: Sebastian Kirsch
Votes: 5
Watchers: 3
Lucene - Java

NGramFilter -- construct n-grams from a TokenStream

Created: 22/Jun/05 06:08 AM   Updated: 11/Oct/08 12:49 PM
Component/s: Analysis
Affects Version/s: unspecified
Fix Version/s: 2.4

Time Tracking:
Not Specified

File Attachments:
  LUCENE-400.patch (Text File, licensed for inclusion in ASF works)  2008-01-14 04:15 AM  Steven Rowe  26 kB
  NGramAnalyzerWrapper.java (Java Source File)  2005-06-22 06:10 AM  Sebastian Kirsch  2 kB
  NGramAnalyzerWrapperTest.java (Java Source File)  2005-07-29 09:56 PM  Sebastian Kirsch  5 kB
  NGramFilter.java (Java Source File)  2005-06-22 06:09 AM  Sebastian Kirsch  6 kB
  NGramFilterTest.java (Java Source File)  2005-06-22 06:12 AM  Sebastian Kirsch  6 kB
Environment:
Operating System: All
Platform: All

Bugzilla Id: 35456
Lucene Fields: Patch Available
Resolution Date: 29/Mar/08 09:09 PM


Description
This filter constructs n-grams (token combinations up to a fixed size, sometimes
called "shingles") from a token stream.
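
As a rough illustration of what "token combinations up to a fixed size" means, here is a minimal sketch. It is not the attached NGramFilter: the class ShingleSketch, the method shingles and the single-space separator are hypothetical, and it assumes every size from 1 up to maxSize is emitted at each position.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not the attached NGramFilter: builds word n-grams
// ("shingles") of every size from 1 up to maxSize from a list of terms.
class ShingleSketch {

    static List<String> shingles(List<String> terms, int maxSize) {
        List<String> out = new ArrayList<>();
        for (int start = 0; start < terms.size(); start++) {
            StringBuilder sb = new StringBuilder();
            // Grow the n-gram one term at a time, emitting each size.
            for (int n = 1; n <= maxSize && start + n <= terms.size(); n++) {
                if (n > 1) sb.append(' '); // separator is an assumption
                sb.append(terms.get(start + n - 1));
                out.add(sb.toString());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Prints: [please, please divide, divide, divide this, this]
        System.out.println(shingles(Arrays.asList("please", "divide", "this"), 2));
    }
}
```

A real TokenFilter would work incrementally on the stream rather than on a full list; the list form is used here only to keep the sketch short.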

The filter sets start offsets, end offsets and position increments, so
highlighting and phrase queries should work.

Position increments greater than 1 in the input stream are replaced by filler
tokens (tokens with termText "_" and zero length, i.e. endOffset - startOffset
= 0) in the output n-grams. Such increments are usually caused by removing
tokens, e.g. stopwords, from the stream.
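
The filler-token behaviour can be sketched as follows. This is an assumption-laden illustration, not the code in NGramFilter.java: the Tok class, the fillGaps method, and the choice to place each filler at the start offset of the following token are all hypothetical. Only the term text "_" and the zero-length offsets come from the description.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the attached NGramFilter: expands position
// increments > 1 into zero-width filler tokens with term text "_".
class FillerSketch {

    static final class Tok {
        final String term;
        final int start, end, posInc;
        Tok(String term, int start, int end, int posInc) {
            this.term = term; this.start = start; this.end = end; this.posInc = posInc;
        }
        @Override public String toString() {
            return term + "[" + start + "," + end + "]+" + posInc;
        }
    }

    static List<Tok> fillGaps(List<Tok> input) {
        List<Tok> out = new ArrayList<>();
        for (Tok t : input) {
            // One filler per skipped position; start == end, so length is 0.
            // Anchoring the filler at the next token's start is an assumption.
            for (int i = 1; i < t.posInc; i++) {
                out.add(new Tok("_", t.start, t.start, 1));
            }
            out.add(new Tok(t.term, t.start, t.end, 1));
        }
        return out;
    }

    public static void main(String[] args) {
        // "wizard of oz" with the stopword "of" removed: "oz" has posInc 2.
        List<Tok> input = new ArrayList<>();
        input.add(new Tok("wizard", 0, 6, 1));
        input.add(new Tok("oz", 10, 12, 2));
        // Prints: [wizard[0,6]+1, _[10,10]+1, oz[10,12]+1]
        System.out.println(fillGaps(input));
    }
}
```

With the gaps filled this way, bigrams built over the result would come out as "wizard _" and "_ oz" rather than a misleading "wizard oz".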

The filter uses CircularFifoBuffer and UnboundedFifoBuffer from Apache
Commons-Collections.

Filter, test case and an analyzer are attached.



Change History
Jeff Turner made changes - 03/Sep/05 03:29 PM
Field Original Value New Value
issue.field.bugzillaimportkey 35456 12314550
Grant Ingersoll made changes - 12/Jan/08 10:56 PM
Resolution Won't Fix [ 2 ]
Status Open [ 1 ] Closed [ 6 ]
Assignee Lucene Developers [ java-dev@lucene.apache.org ]
Grant Ingersoll made changes - 12/Jan/08 11:13 PM
Link This issue duplicates LUCENE-759 [ LUCENE-759 ]
Grant Ingersoll made changes - 13/Jan/08 01:33 PM
Status Closed [ 6 ] Reopened [ 4 ]
Resolution Won't Fix [ 2 ]
Steven Rowe made changes - 14/Jan/08 04:15 AM
Attachment LUCENE-400.patch [ 12373074 ]
Grant Ingersoll made changes - 14/Jan/08 12:29 PM
Lucene Fields [Patch Available]
Fix Version/s 2.4 [ 12312681 ]
Steven Rowe made changes - 14/Jan/08 06:38 PM
Link This issue duplicates LUCENE-759 [ LUCENE-759 ]
Otis Gospodnetic made changes - 14/Jan/08 06:55 PM
Assignee Otis Gospodnetic [ otis ]
Otis Gospodnetic made changes - 25/Mar/08 10:39 PM
Assignee Otis Gospodnetic [ otis ] Grant Ingersoll [ gsingers ]
Grant Ingersoll made changes - 29/Mar/08 09:09 PM
Resolution Fixed [ 1 ]
Status Reopened [ 4 ] Resolved [ 5 ]
Michael McCandless made changes - 11/Oct/08 12:49 PM
Status Resolved [ 5 ] Closed [ 6 ]