SOLR-2842 (Solr)

Re-factor UpdateChain and UpdateProcessor interfaces

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: update
    • Labels:
      None

      Description

      The UpdateChain's main task is to send SolrInputDocuments through a chain of UpdateRequestProcessors, which transform them in some way and then (typically) index them.

      This generic "pipeline" concept would also be useful on the client side (SolrJ), so that we could choose to do part or all of the processing on the client. The most prominent use case is extracting text (Tika) from large binary documents that reside on local storage on the client(s). Streaming hundreds of MB over to Solr for processing is not efficient. See SOLR-1526.

      We're already implementing Tika as an UpdateProcessor in SOLR-1763, and what would be more natural than reusing this - and any other processor - on the client side?

      However, for this to be possible, some interfaces need to change slightly.
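
      To make the idea concrete, here is a minimal sketch of what a client-side pipeline could look like. It is not the proposed API and does not use Solr or Tika classes: DocumentProcessor, Pipeline, and the field names "raw" and "text" are hypothetical, a plain Map stands in for SolrInputDocument, and the "extraction" stage merely decodes bytes as UTF-8 where a real stage would call Tika.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ClientPipelineSketch {

    /** Hypothetical processing stage; a stand-in, not Solr's UpdateRequestProcessor. */
    interface DocumentProcessor {
        void process(Map<String, Object> doc);
    }

    /** Runs a document through every stage, in the order the stages were added. */
    static final class Pipeline {
        private final List<DocumentProcessor> stages = new ArrayList<>();

        Pipeline add(DocumentProcessor stage) {
            stages.add(stage);
            return this;
        }

        Map<String, Object> run(Map<String, Object> doc) {
            for (DocumentProcessor stage : stages) {
                stage.process(doc);
            }
            return doc;
        }
    }

    /** Builds a two-stage example chain: fake text extraction, then a derived field. */
    static Pipeline buildExample() {
        return new Pipeline()
            // Placeholder for text extraction: replaces the binary payload with text,
            // so only the (much smaller) text would need to be sent to the server.
            .add(doc -> {
                Object raw = doc.remove("raw");
                if (raw instanceof byte[]) {
                    doc.put("text", new String((byte[]) raw, StandardCharsets.UTF_8));
                }
            })
            // A second, independent stage that reads what the first one produced.
            .add(doc -> {
                Object text = doc.get("text");
                if (text instanceof String) {
                    doc.put("text_length", ((String) text).length());
                }
            });
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("id", "1");
        doc.put("raw", "hello".getBytes(StandardCharsets.UTF_8));

        buildExample().run(doc);
        System.out.println(doc);
    }
}
```

      The point of the sketch is that a stage only depends on the document it is handed. A processor written that way could run inside Solr or inside a SolrJ client without changes, which is the kind of reuse this issue is after.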

        Issue Links

          Activity

          Hoss Man made changes -
          Link: This issue relates to SOLR-2802
          Jan Høydahl made changes -
          Link: This issue is required by SOLR-1526
          Jan Høydahl created issue -

            People

            • Assignee:
              Unassigned
              Reporter:
              Jan Høydahl
            • Votes:
              0
              Watchers:
              0