[SOLR-2842] Re-factor UpdateChain and UpdateProcessor interfaces - ASF JIRA

Details

Type: Improvement
Status: Closed
Priority: Major
Resolution: Won't Fix
Affects Version/s: None
Fix Version/s: None
Component/s: update
Labels: None

Description

The UpdateChain's main task is to send SolrInputDocuments through a chain of UpdateRequestProcessors, which transform them in some way and then (typically) index them.
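To make the chain idea concrete, here is a minimal, self-contained sketch of the pattern. It deliberately does not use Solr's classes: `Doc`, `Processor`, `TrimProcessor` and `CollectingSink` are hypothetical names invented for illustration, and the real interfaces may differ. Each stage does its work and hands the document to the next stage; the last stage decides what finally happens to it.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PipelineSketch {

    /** Hypothetical stand-in for a document: field name -> value. */
    static final class Doc {
        final Map<String, Object> fields = new LinkedHashMap<>();
    }

    /** Hypothetical stage: does its own work, then passes the document on. */
    static abstract class Processor {
        private final Processor next;

        Processor(Processor next) {
            this.next = next;
        }

        void process(Doc doc) {
            if (next != null) {
                next.process(doc);
            }
        }
    }

    /** Example transformation: trims whitespace from every String value. */
    static final class TrimProcessor extends Processor {
        TrimProcessor(Processor next) {
            super(next);
        }

        @Override
        void process(Doc doc) {
            doc.fields.replaceAll((k, v) -> v instanceof String ? ((String) v).trim() : v);
            super.process(doc);
        }
    }

    /** Terminal stage (assumption): collects documents instead of indexing them. */
    static final class CollectingSink extends Processor {
        final List<Doc> received = new ArrayList<>();

        CollectingSink() {
            super(null);
        }

        @Override
        void process(Doc doc) {
            received.add(doc);
        }
    }

    public static void main(String[] args) {
        CollectingSink sink = new CollectingSink();
        Processor chain = new TrimProcessor(sink);

        Doc doc = new Doc();
        doc.fields.put("title", "  Hello Solr  ");
        chain.process(doc);

        System.out.println(sink.received.get(0).fields); // prints {title=Hello Solr}
    }
}
```

In this sketch only the terminal stage knows where documents end up, so the same transforming stages could in principle run in front of a stage that indexes locally or one that sends documents to a remote server.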

This generic "pipeline" concept would also be useful on the client side (SolrJ), so that we could choose to do part or all of the processing on the client. The most prominent use case is extracting text (Tika) from large binary documents residing on local storage on the client(s). Streaming hundreds of MB over to Solr for processing is not efficient. See SOLR-1526.

We're already implementing Tika as an UpdateProcessor in SOLR-1763, and what would be more natural than reusing this - and any other processor - on the client side?

However, for this to be possible, some interfaces need to change slightly.

Issue Links

is required by

SOLR-1526 Client Side Tika integration

Resolved

relates to

SOLR-2802 Toolkit of UpdateProcessors for modifying document values

Closed

People

Assignee: Unassigned

Reporter: Jan Høydahl

Votes: 0

Watchers: 1

Dates

Created: 16/Oct/11 13:33

Updated: 28/Sep/16 11:19

Resolved: 28/Sep/16 11:19