Details
Type: Task
Status: Closed
Priority: Trivial
Resolution: Fixed
3.0.0 PDFBox
None
Description
On TIKA-3347, we're integrating PDFBox 3.0.0-RC1. We're getting new flate filter exceptions on a set of files that I think I created with PDFBox a while ago.
Looks like we're also getting xref exceptions.
I would not be surprised in the least to learn that I did something wrong in the creation of these files and that they are corrupt!
I can replicate this issue with:
    java -jar pdfbox-app-3.0.0-RC1.jar export:text
Output:
    SEVERE: FlateFilter: stop reading corrupt stream due to a DataFormatException
    Error extracting text for document [IOException]: java.util.zip.DataFormatException: invalid block type
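For anyone trying to narrow this down, here is a minimal sketch of how the JDK's java.util.zip.Inflater raises a DataFormatException on corrupt deflate data. It is not PDFBox's FlateFilter code. The class name CorruptFlateDemo and the helper inflateError are hypothetical, and the use of raw deflate (nowrap = true) with a one-byte input whose block-type bits are the reserved value is an assumption made to keep the example small.

```java
import java.util.zip.DataFormatException;
import java.util.zip.Inflater;

public class CorruptFlateDemo {

    /**
     * Hypothetical helper: tries to inflate raw deflate data and returns the
     * DataFormatException message, or null if no format error was raised.
     */
    static String inflateError(byte[] data) {
        // nowrap = true: raw deflate without a zlib header (an assumption for this sketch)
        Inflater inflater = new Inflater(true);
        inflater.setInput(data);
        byte[] out = new byte[64];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(out);
                // Stop if no progress is possible, to avoid looping forever
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    break;
                }
            }
            return null;
        } catch (DataFormatException e) {
            return e.getMessage();
        } finally {
            inflater.end();
        }
    }

    public static void main(String[] args) {
        // 0x07: final-block bit set, block-type bits 11 (reserved in the deflate format)
        byte[] corrupt = {0x07};
        System.out.println("error: " + inflateError(corrupt));

        // 0x03 0x00: a valid, empty fixed-Huffman block
        byte[] valid = {0x03, 0x00};
        System.out.println("error: " + inflateError(valid));
    }
}
```

Running the same kind of check against the raw bytes of a failing stream could help tell whether the stream data itself is corrupt or whether the problem lies in how the stream is located (which would fit with the xref exceptions).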