[SOLR-477] AnalysisRequestHandler - ASF JIRA

Details

Type: New Feature
Status: Closed
Priority: Minor
Resolution: Fixed
Affects Version/s: None
Fix Version/s: None
Component/s: None
Labels: None

Description

Being able to programmatically access tokenization information can be quite useful not only in Solr, but in other NLP applications where token vectors are necessary.

The patch to follow creates an AnalysisRequestHandler, which runs a document through the analysis chain and returns a response containing the resulting tokens along with their offsets, position increment, type, and value.
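To make the shape of that per-token information concrete, here is a minimal, self-contained sketch in plain Java. It is not the code from the patch and does not use the Lucene or Solr APIs: the class names `TokenInfo` and `WhitespaceAnalysisSketch`, the stop-word list, and the `"word"` type label are all assumptions made for illustration. It splits text on whitespace, drops a few stop words, and records for each surviving token its value, start and end offsets, position increment, and type.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical holder for the per-token fields named in the description. */
final class TokenInfo {
    final String value;           // token text
    final int start;              // start offset in the input (inclusive)
    final int end;                // end offset in the input (exclusive)
    final int positionIncrement;  // positions advanced since the previous emitted token
    final String type;            // token type label (assumed to be "word" here)

    TokenInfo(String value, int start, int end, int positionIncrement, String type) {
        this.value = value;
        this.start = start;
        this.end = end;
        this.positionIncrement = positionIncrement;
        this.type = type;
    }

    @Override
    public String toString() {
        return value + " [" + start + "," + end + ") posInc=" + positionIncrement + " type=" + type;
    }
}

/** Toy analysis chain: whitespace tokenizer followed by stop-word removal. */
final class WhitespaceAnalysisSketch {
    // Assumed stop-word list, for illustration only.
    private static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "an", "the"));

    static List<TokenInfo> analyze(String text) {
        List<TokenInfo> tokens = new ArrayList<>();
        int i = 0;
        int n = text.length();
        int skipped = 0; // stop words dropped since the last emitted token
        while (i < n) {
            // Skip leading whitespace.
            while (i < n && Character.isWhitespace(text.charAt(i))) {
                i++;
            }
            if (i >= n) {
                break;
            }
            int start = i;
            // Consume one run of non-whitespace characters.
            while (i < n && !Character.isWhitespace(text.charAt(i))) {
                i++;
            }
            String value = text.substring(start, i);
            if (STOP_WORDS.contains(value.toLowerCase())) {
                skipped++; // the gap is reflected in the next token's increment
                continue;
            }
            tokens.add(new TokenInfo(value, start, i, 1 + skipped, "word"));
            skipped = 0;
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (TokenInfo t : analyze("the quick fox")) {
            System.out.println(t);
        }
    }
}
```

In this sketch a dropped stop word shows up as a position increment greater than 1 on the following token, which is the kind of detail that is hard to see without programmatic access to the analysis output.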

The patch also adds some character array processing to XML and adds Token handling to XMLWriter.

I only implemented XML output, as I don't know JSON or the other response types. If someone else is so motivated, they can add those.

Attachments

- SOLR-477.patch, 12/Feb/08 13:39, 21 kB, Grant Ingersoll
- SOLR-477.patch, 12/Feb/08 04:28, 22 kB, Grant Ingersoll

People

Assignee: Grant Ingersoll

Reporter: Grant Ingersoll

Votes: 0

Watchers: 1

Dates

Created: 12/Feb/08 04:19

Updated: 01/Nov/18 21:17

Resolved: 20/Feb/08 04:07