We're now extracting macros from MS Office files (TIKA-2069). We should do the equivalent for JavaScript in PDFs.
Make handling of macros equivalent between VBA in MS Office and JS in PDFs
Extract <script> elements in HTML as "attachment" type MACRO, like we do in the PDFParser
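To make the HTML half of this request concrete, here is a minimal, stdlib-only Java sketch of pulling inline <script> bodies out of a page so each one could be handed off as a separate macro-type item. This is not Tika's implementation: the class name `ScriptMacroSketch` and the method `extractScripts` are hypothetical, and a regular expression stands in for a real HTML parser, so it will miss edge cases such as a `</script>` sequence inside a string literal.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of Tika.
final class ScriptMacroSketch {

    // Matches <script ...>body</script>, case-insensitively, across line breaks.
    private static final Pattern SCRIPT = Pattern.compile(
            "<script\\b[^>]*>(.*?)</script\\s*>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns the trimmed body of every inline script, in document order.
    // Scripts with an empty body (for example, those that only set src) are skipped.
    static List<String> extractScripts(String html) {
        List<String> scripts = new ArrayList<>();
        Matcher m = SCRIPT.matcher(html);
        while (m.find()) {
            String body = m.group(1).trim();
            if (!body.isEmpty()) {
                scripts.add(body);
            }
        }
        return scripts;
    }

    public static void main(String[] args) {
        String html = "<html><head><script src=\"lib.js\"></script>"
                + "<script type=\"text/javascript\">alert(1);</script></head>"
                + "<body><p>text</p></body></html>";
        // Each extracted body would be treated as one embedded "macro" item.
        for (String script : extractScripts(html)) {
            System.out.println(script);
        }
    }
}
```

In a real parser, each extracted body would presumably be passed to the embedded-document handling with its resource type marked as a macro, mirroring what is done for VBA and PDF JavaScript; that wiring is omitted here.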
How hard could it be?
Fixed for extraction from common locations.
FAILURE: Integrated in Jenkins build tika-2.x-windows #81 (See https://builds.apache.org/job/tika-2.x-windows/81/)
SUCCESS: Integrated in Jenkins build Tika-trunk #1149 (See https://builds.apache.org/job/Tika-trunk/1149/)
TIKA-2090 – first draft (tallison: rev 7fbf0f304d8e20f7e26baadee1f85974b03dee8e)
SUCCESS: Integrated in Jenkins build tika-2.x #180 (See https://builds.apache.org/job/tika-2.x/180/)