TIKA-4012: Improve extraction of embedded documents in PDFs

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.8.0
    • Component/s: None
    • Labels: None

    Description

      We currently look for file specification dictionaries in only two places: the EmbeddedFiles entry of the name tree, and annotations. Unfortunately, PDFs may embed files in many other places. The newly free PDF 2.0 spec (ISO 32000-2) makes this abundantly and painfully clear.
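
      To illustrate the gap, here is a minimal sketch of a location-agnostic search. It is not Tika's or PDFBox's implementation: the PDF object graph is modelled with plain `Map` (dictionary) and `List` (array) values, and the class name `FileSpecWalker` and method `findFileSpecs` are hypothetical. The sketch assumes a dictionary counts as a file specification if it has `/Type /Filespec` or carries an `/EF` entry, and it walks every reachable object rather than only the name tree and annotations.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: a toy model of a PDF object graph, not a PDFBox or Tika API.
final class FileSpecWalker {

    private FileSpecWalker() {
    }

    /**
     * Walks every dictionary and array reachable from root and returns the
     * dictionaries that look like file specifications, wherever they occur.
     */
    static List<Map<?, ?>> findFileSpecs(Object root) {
        List<Map<?, ?>> found = new ArrayList<>();
        // Identity-based "seen" set: PDF graphs contain cycles (e.g. /Parent links),
        // and the same file spec may be referenced from several places.
        Set<Object> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Object> stack = new ArrayDeque<>();
        if (root != null) {
            stack.push(root);
        }
        while (!stack.isEmpty()) {
            Object node = stack.pop();
            if (node instanceof Map) {
                if (!seen.add(node)) {
                    continue; // already visited
                }
                Map<?, ?> dict = (Map<?, ?>) node;
                // Assumption: /Type /Filespec or an /EF entry marks a file spec.
                if ("Filespec".equals(dict.get("Type")) || dict.containsKey("EF")) {
                    found.add(dict);
                }
                for (Object value : dict.values()) {
                    if (value != null) {
                        stack.push(value);
                    }
                }
            } else if (node instanceof List) {
                if (!seen.add(node)) {
                    continue;
                }
                for (Object value : (List<?>) node) {
                    if (value != null) {
                        stack.push(value);
                    }
                }
            }
        }
        return found;
    }
}
```

      With a real parser the same idea would mean traversing the document's object graph from the catalog. A full traversal costs more than checking two known locations, but it would also pick up file specs hanging off entries such as `/AF` (associated files) that the two existing lookups miss.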

            People

              Assignee: Unassigned
              Reporter: Tim Allison (tallison)