PDFBox / PDFBOX-5166

Implement RichMedia annotation

    Details

    • Type: New Feature
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: PDModel
    • Labels:

      Description

      See TIKA-3359. The attached file has an embedded Flash/SWF file. Tika does not currently extract the embedded file.

      In the debugger, I can see that the annotation is a PDAnnotationUnknown. In its COSDictionary, the subtype is "RichMedia". If someone has the time, it'd be great to implement this so that we can extract more attachments in Tika. Obviously, others may find it useful too.
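
To make the request concrete, here is a rough sketch of the dictionary walk that a RichMedia implementation would need. It is not PDFBox code: plain `java.util.Map` and `List` values stand in for COSDictionary and COSArray so the example runs with the JDK alone (Java 9+). The keys it reads (`RichMediaContent`, `Assets`, `Names`, `Kids`, `EF`, `F`) are my reading of the RichMedia annotation layout in Adobe's extensions to ISO 32000 and should be checked against the spec. The class name, method names, and sample data are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical model: Maps/Lists stand in for COSDictionary/COSArray.
public class RichMediaSketch {

    /** True when the annotation dictionary's /Subtype is RichMedia. */
    public static boolean isRichMedia(Map<String, Object> annotation) {
        return "RichMedia".equals(annotation.get("Subtype"));
    }

    /** Names of assets under /RichMediaContent /Assets that carry an /EF /F stream. */
    public static List<String> assetNames(Map<String, Object> annotation) {
        List<String> found = new ArrayList<>();
        if (!isRichMedia(annotation)) {
            return found;
        }
        Map<String, Object> content = asMap(annotation.get("RichMediaContent"));
        if (content != null) {
            walkNameTree(asMap(content.get("Assets")), found);
        }
        return found;
    }

    // A name tree node holds /Names as [name1, fileSpec1, name2, fileSpec2, ...]
    // and may delegate to child nodes through /Kids.
    private static void walkNameTree(Map<String, Object> node, List<String> out) {
        if (node == null) {
            return;
        }
        Object names = node.get("Names");
        if (names instanceof List) {
            List<?> pairs = (List<?>) names;
            for (int i = 0; i + 1 < pairs.size(); i += 2) {
                Map<String, Object> fileSpec = asMap(pairs.get(i + 1));
                Map<String, Object> ef = fileSpec == null ? null : asMap(fileSpec.get("EF"));
                if (ef != null && ef.get("F") != null) {
                    out.add(String.valueOf(pairs.get(i)));
                }
            }
        }
        Object kids = node.get("Kids");
        if (kids instanceof List) {
            for (Object kid : (List<?>) kids) {
                walkNameTree(asMap(kid), out);
            }
        }
    }

    @SuppressWarnings("unchecked")
    private static Map<String, Object> asMap(Object o) {
        return (o instanceof Map) ? (Map<String, Object>) o : null;
    }

    /** Made-up sample data shaped like a RichMedia annotation with one SWF asset. */
    public static Map<String, Object> sampleAnnotation() {
        Map<String, Object> fileSpec = Map.of("EF", Map.of("F", new byte[] {0x46, 0x57, 0x53}));
        Map<String, Object> assets = Map.of("Names", List.of("movie.swf", fileSpec));
        return Map.of("Subtype", "RichMedia", "RichMediaContent", Map.of("Assets", assets));
    }

    public static void main(String[] args) {
        System.out.println(assetNames(sampleAnnotation())); // prints [movie.swf]
    }
}
```

In PDFBox itself, the same keys would presumably be read from the annotation's COSDictionary, and each `F` entry would be a stream whose bytes Tika could hand to its embedded-document extractor.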

      Many thanks to Tyler Thorsted for the test file and many thanks to @terminalboredom and @beet_keeper.

        Attachments

        1. testFlashInPDF.pdf
          158 kB
          Tim Allison

            People

            • Assignee: Unassigned
            • Reporter: Tim Allison (tallison)
            • Votes: 0
            • Watchers: 3