Description
See https://issues.apache.org/jira/browse/TIKA-611. The options used by PDFBox are currently hard-coded in the PDFParser code; we will allow them to be specified via ParseContext objects.
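A minimal sketch of the idea, not the actual patch: the parser looks up an options object in a type-keyed context and falls back to built-in defaults when the caller supplied none. `SketchContext`, `PdfOptions`, `SketchPdfParser`, the two option names, and their default values are all hypothetical stand-ins chosen for illustration; only the set/get-by-class lookup is modeled on how ParseContext is used.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a type-keyed context; NOT Tika's ParseContext class.
final class SketchContext {
    private final Map<Class<?>, Object> entries = new HashMap<>();

    // Register an object under its class.
    <T> void set(Class<T> key, T value) {
        entries.put(key, value);
    }

    // Look up an object by class, falling back to the given default.
    <T> T get(Class<T> key, T defaultValue) {
        Object value = entries.get(key);
        return value == null ? defaultValue : key.cast(value);
    }
}

// Hypothetical options holder; option names and defaults are assumptions.
final class PdfOptions {
    final boolean enableAutoSpace;
    final boolean suppressDuplicateOverlappingText;

    PdfOptions(boolean enableAutoSpace, boolean suppressDuplicateOverlappingText) {
        this.enableAutoSpace = enableAutoSpace;
        this.suppressDuplicateOverlappingText = suppressDuplicateOverlappingText;
    }

    // The values that would previously have been hard-coded in the parser.
    static PdfOptions defaults() {
        return new PdfOptions(true, false);
    }
}

// Hypothetical parser that reads its options from the context instead of constants.
final class SketchPdfParser {
    String describe(SketchContext context) {
        PdfOptions options = context.get(PdfOptions.class, PdfOptions.defaults());
        return "autoSpace=" + options.enableAutoSpace
                + ", suppressDuplicates=" + options.suppressDuplicateOverlappingText;
    }
}
```

With this shape, a caller that wants different behavior registers its own `PdfOptions` in the context before parsing, and callers that register nothing keep the existing defaults.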
Attachments
Issue Links
- blocks
- SOLR-2930 Allow controlling an important PDF processing parameter in Tika that splits the words in text and is now supported in version 1.0 of Tika.
- Open