Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Fixed
- 1.8.12, 2.0.1
- None
Description
I have a PDF that does not render in PDFBox. It contains scanned pages stored as CCITT fax-encoded TIFF images. On every page, the decoder fails with IOException("TIFFFaxDecoder: EOL encountered in black run.") or with the same message reading "white" instead of "black". Unfortunately, the PDF contains sensitive data, so I cannot share it.
As a test, I replaced TIFFFaxDecoder with the CCITTFaxDecoderStream class from the TwelveMonkeys ImageIO library. After that change, every page decoded correctly and PDFToImage produced the expected result.
To show the problem without sharing the confidential content, I have extracted the first few bytes of the TIFF data. See the attached test program and test file.
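For anyone who needs to build a similar shareable sample, the following is a minimal sketch of copying only the first bytes of a stream. It is not the attached test program; the class name StreamPrefix, the method readPrefix, and the command-line arguments are hypothetical and chosen only for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of PDFBox or of the attached test program.
final class StreamPrefix {

    /**
     * Reads at most maxBytes from the start of the stream.
     * Returns fewer bytes if the stream ends earlier.
     */
    static byte[] readPrefix(InputStream in, int maxBytes) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int remaining = maxBytes;
        while (remaining > 0) {
            // Never request more than what is still needed.
            int n = in.read(buffer, 0, Math.min(buffer.length, remaining));
            if (n < 0) {
                break; // end of stream reached before maxBytes
            }
            out.write(buffer, 0, n);
            remaining -= n;
        }
        return out.toByteArray();
    }

    // Usage: java StreamPrefix <input file> <output file> <byte count>
    public static void main(String[] args) throws IOException {
        if (args.length != 3) {
            System.err.println("usage: StreamPrefix <input> <output> <byteCount>");
            return;
        }
        int count = Integer.parseInt(args[2]);
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            Files.write(Paths.get(args[1]), readPrefix(in, count));
        }
    }
}
```

The byte count has to be large enough to still trigger the decoder error and small enough that no recognizable page content remains; the right value depends on the file and has to be found by trial.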
I tested this against the latest trunk version of PDFBox, but I believe the decoder implementation is essentially the same in all versions.
Attachments
Issue Links
- duplicates: PDFBOX-2779 PDF to Image Conversion fails with "EOL encountered in white run" (Closed)
- is duplicated by: PDFBOX-2779 PDF to Image Conversion fails with "EOL encountered in white run" (Closed)
- links to