Description
If an object or xref starts at same line after the 'endobj' token and the closed object contains a stream, parsing of next object fails.
Example:
endstream
endobj xref
0 26
In PDFParser if an object contains a stream the 'endobj' token is read via readLine(). Thus the line break is consumed as well. Now the 'endobj' with following command is handled but only 'xref' is pushed back and not the line break which results in 'xref0' when trying to read next pbject. Thus in this case a simple solution is to push back a space byte before the 'xref'.
I will add a patch for it.
Part of the problem can be seen in PDF from http://onlinelibrary.wiley.com/doi/10.1111/j.1399-6576.2009.02134.x/pdf at last 'endobj'. However the last object does not contain a stream and I was not able to produce such a PDF (the PDFs I have containing described problematic construct are unfortunately confidential).
Attachments
Attachments
Issue Links
- is duplicated by
-
PDFBOX-978 unreading of trailing content after 'endobj' is missing new line byte (fix included)
- Closed