Details
Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Description
I found that a PDF created by Exstream Dialogue Version 5.0.039 had a stray ">> " between the endstream and endobj keywords. PDFBox throws an exception when it encounters this. This patch makes the parser ignore junk characters between the two keywords so that such files can still be processed, and it logs a warning to tell the user that the file violates the spec. For reference, here is the offending object from the file (stream data omitted):
27 0 obj
<<
/Filter [/A85 /Fl]
/Length 322
>>
stream
(data from stream omitted)
endstream
>> endobj
%PDF Font (F315)
As a side note, Exstream seems to have sold its Dialogue software to HP, and the current version is 7. The bug is therefore likely fixed in the latest version, but older PDFs are still in circulation, and PDFBox should be able to handle them without throwing an exception.
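To illustrate the lenient behaviour described above, here is a minimal sketch in Java. It is not the code from the patch and does not use PDFBox's parser classes; the class name LenientEndObjScanner, the Result type, and the skipToEndObj method are hypothetical names chosen for this example. It assumes the text following the stream is available as a String and that the offset just past the endstream keyword is already known.

```java
// Hypothetical illustration only: not PDFBox's actual parser code.
class LenientEndObjScanner {

    /** Outcome of the scan: where parsing may resume and what was skipped. */
    static final class Result {
        public final int resumeOffset;    // index just past "endobj"
        public final String skippedJunk;  // empty when the file follows the spec

        Result(int resumeOffset, String skippedJunk) {
            this.resumeOffset = resumeOffset;
            this.skippedJunk = skippedJunk;
        }
    }

    private static final String ENDOBJ = "endobj";

    /**
     * Scans forward from the position just after "endstream" until "endobj"
     * is found. Anything other than whitespace in between is treated as junk:
     * it is skipped and reported with a warning instead of raising an error.
     */
    static Result skipToEndObj(String data, int afterEndstream) {
        StringBuilder junk = new StringBuilder();
        for (int i = afterEndstream; i < data.length(); i++) {
            if (data.startsWith(ENDOBJ, i)) {
                String skipped = junk.toString().trim();
                if (!skipped.isEmpty()) {
                    // Stand-in for the log message mentioned in the description.
                    System.err.println("WARN: ignoring unexpected '" + skipped
                            + "' between endstream and endobj");
                }
                return new Result(i + ENDOBJ.length(), skipped);
            }
            junk.append(data.charAt(i));
        }
        // No endobj at all is still an error; only junk before it is tolerated.
        throw new IllegalArgumentException("endobj not found after endstream");
    }
}
```

Calling skipToEndObj("endstream\n>> endobj\n%PDF Font (F315)", 9) on the tail of the object shown above skips the stray ">>", prints a warning, and returns the offset just past endobj. A well-formed tail such as "endstream\nendobj" produces no warning.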