Key: LUCENE-967
Type: Improvement
Status: Closed
Resolution: Fixed
Priority: Minor
Assignee: Michael McCandless
Reporter: Michael McCandless
Votes: 0
Watchers: 1
Lucene - Java

Add "tokenize documents only" task to contrib/benchmark

Created: 26/Jul/07 06:08 PM   Updated: 25/Jan/08 03:24 AM
Component/s: contrib/benchmark
Affects Version/s: 2.3
Fix Version/s: 2.3

Time Tracking:
Not Specified

File Attachments:
  File                      Date                 Author              Size
  LUCENE-967.patch          2007-07-26 06:09 PM  Michael McCandless  11 kB
  LUCENE-967.take2.patch    2007-07-29 12:59 AM  Michael McCandless  12 kB
  LUCENE-967.take3.patch    2007-08-01 12:16 PM  Michael McCandless  14 kB
  (All attachments are text files licensed for inclusion in ASF works.)

Lucene Fields: Patch Available, New
Resolution Date: 01/Aug/07 06:55 PM


Description
I've been looking at performance improvements to tokenization by
re-using Tokens, and to help benchmark my changes I've added a new
task called ReadTokens that just steps through all fields in a
document, gets a TokenStream, and reads all the tokens out of it.

For example, this alg just reads all Tokens for all docs in the Reuters collection:

doc.maker=org.apache.lucene.benchmark.byTask.feeds.ReutersDocMaker
doc.maker.forever=false
{ReadTokens > : *
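
For readers unfamiliar with the idea, here is a minimal, self-contained sketch of what a "tokenize only" pass with a re-used token might look like. It is not the code from the attached patches and does not use Lucene's classes: `Tok`, `WhitespaceStream`, and `readTokens` are hypothetical stand-ins, and splitting on whitespace is an assumption made only to keep the example runnable without Lucene on the classpath.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ReadTokensSketch {

    /** Hypothetical stand-in for a reusable token (NOT Lucene's Token class). */
    static final class Tok {
        String text;
        int start;
        int end;
    }

    /** Hypothetical stand-in for a token stream over one field's text. */
    static final class WhitespaceStream {
        private final String input;
        private int pos = 0;

        WhitespaceStream(String input) {
            this.input = input;
        }

        /** Fills the caller-supplied token and returns it, or returns null at end of input. */
        Tok next(Tok reuse) {
            // Skip leading whitespace.
            while (pos < input.length() && Character.isWhitespace(input.charAt(pos))) {
                pos++;
            }
            if (pos >= input.length()) {
                return null;
            }
            int start = pos;
            // Advance to the end of the current run of non-whitespace characters.
            while (pos < input.length() && !Character.isWhitespace(input.charAt(pos))) {
                pos++;
            }
            reuse.text = input.substring(start, pos);
            reuse.start = start;
            reuse.end = pos;
            return reuse;
        }
    }

    /** Steps through all fields of a document and reads every token; returns the token count. */
    static int readTokens(Map<String, String> doc) {
        Tok reuse = new Tok(); // one token object shared across all fields
        int count = 0;
        for (Map.Entry<String, String> field : doc.entrySet()) {
            WhitespaceStream stream = new WhitespaceStream(field.getValue());
            while (stream.next(reuse) != null) {
                count++; // tokens are consumed and discarded; nothing is indexed
            }
        }
        return count;
    }

    public static void main(String[] args) {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("title", "Reuters sample headline");
        doc.put("body", "some body text to tokenize");
        System.out.println(readTokens(doc)); // prints 8
    }
}
```

Because the loop does nothing except pull tokens, timing it isolates tokenization cost from indexing cost, which is what makes such a task useful for measuring the effect of token re-use.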



Change History
Michael McCandless made changes - 26/Jul/07 06:09 PM
Field Original Value New Value
Attachment LUCENE-967.patch [ 12362636 ]
Michael McCandless made changes - 26/Jul/07 06:10 PM
Status Open [ 1 ] In Progress [ 3 ]
Michael McCandless made changes - 26/Jul/07 06:10 PM
Lucene Fields [New] [New, Patch Available]
Michael McCandless made changes - 29/Jul/07 12:59 AM
Attachment LUCENE-967.take2.patch [ 12362728 ]
Michael McCandless made changes - 01/Aug/07 12:16 PM
Attachment LUCENE-967.take3.patch [ 12362968 ]
Michael McCandless made changes - 01/Aug/07 06:55 PM
Lucene Fields [Patch Available, New] [New, Patch Available]
Status In Progress [ 3 ] Resolved [ 5 ]
Resolution Fixed [ 1 ]
Michael Busch made changes - 25/Jan/08 03:24 AM
Status Resolved [ 5 ] Closed [ 6 ]