Solr / SOLR-7229

Allow DIH to handle attachments as separate documents

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Won't Fix

    Description

      With Tika 1.7's RecursiveParserWrapper, it is possible to preserve the metadata of individual attachments/embedded documents. Tika's default handling was to keep only the metadata of the container document and concatenate the contents of all embedded files. With SOLR-7189, we added that legacy behavior.

      It might be handy, for example, to be able to send an MSG file through DIH and treat the container email as well as each attachment as separate (child?) documents, or to send a zip of JPEG files and correctly index the geolocation of each image file.
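
The difference between the two behaviors can be sketched with a small model. This is only an illustration in plain Python, not Tika's or DIH's API: `ParsedPart`, `to_legacy_doc`, `to_nested_docs`, the `!n` child-id scheme, and the field names `geo_lat`/`geo_long` are all hypothetical. The `_childDocuments_` key follows the convention Solr's JSON update format uses for nested documents; whether DIH would emit children that way is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class ParsedPart:
    """Hypothetical model of one parsed file: a container or an embedded attachment."""
    name: str
    metadata: dict
    content: str
    attachments: list = field(default_factory=list)


def iter_parts(part):
    """Yield the part itself, then every embedded part, depth first."""
    yield part
    for child in part.attachments:
        yield from iter_parts(child)


def to_legacy_doc(container):
    """Legacy-style result: container metadata only, all text concatenated."""
    texts = [p.content for p in iter_parts(container) if p.content]
    return {**container.metadata, "id": container.name, "content": "\n".join(texts)}


def to_nested_docs(container):
    """Per-attachment result: each embedded part keeps its own metadata as a child."""
    doc = {**container.metadata, "id": container.name, "content": container.content}
    embedded = list(iter_parts(container))[1:]  # everything except the container
    children = [
        {**p.metadata, "id": f"{container.name}!{i}", "content": p.content}
        for i, p in enumerate(embedded, start=1)
    ]
    if children:
        doc["_childDocuments_"] = children
    return doc


# A zip of two JPEG files, each carrying its own (made-up) geolocation metadata.
archive = ParsedPart(
    "photos.zip",
    {"content_type": "application/zip"},
    "",
    [
        ParsedPart("a.jpg", {"geo_lat": 48.85, "geo_long": 2.35}, ""),
        ParsedPart("b.jpg", {"geo_lat": 40.71, "geo_long": -74.0}, ""),
    ],
)

legacy = to_legacy_doc(archive)
nested = to_nested_docs(archive)

print("geo_lat" in legacy)  # False: the per-image metadata is lost
print([c["geo_lat"] for c in nested["_childDocuments_"]])  # [48.85, 40.71]
```

In the legacy-style document the per-image coordinates are gone, while the nested form keeps one child per image with its own metadata, which is the behavior this issue asks for.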

            People

              arafalov Alexandre Rafalovitch
              tallison Tim Allison
              Votes: 2
              Watchers: 7

              Dates

                Created:
                Updated:
                Resolved: