PDFBOX-3338: CCITT Fax decoder fails


Details

    Description

      I have a PDF that does not render in PDFBox. It contains scanned pages encoded as CCITT Fax TIFFs. On every page, the decoder fails with IOException("TIFFFaxDecoder: EOL encountered in black run.") (or the same message with "white" instead of "black"). Unfortunately, the PDF contains sensitive data, so I cannot share it.

      As a test, I replaced the TIFFFaxDecoder class with the CCITTFaxDecoderStream class from the Twelve Monkeys ImageIO library. After that, everything worked and PDFToImage produced the expected result.

      I have extracted the first few bytes of the TIFF to show the problem without sharing the confidential content. See the attached test program and test file.
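
For anyone who needs to produce a similar non-confidential sample, the truncation step can be sketched roughly as follows. This is only an illustration using the standard library; the class name `TruncateSample` and the helper `head` are hypothetical and are not taken from the attached TestCCITTFaxDecoder.java. It assumes the raw image stream has already been saved to a file.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

public class TruncateSample {

    // Hypothetical helper: returns at most maxBytes bytes from the start of the stream.
    // If the stream is shorter than maxBytes, the whole stream is returned.
    static byte[] head(InputStream in, int maxBytes) throws IOException {
        byte[] buffer = new byte[maxBytes];
        int total = 0;
        while (total < maxBytes) {
            int n = in.read(buffer, total, maxBytes - total);
            if (n < 0) {
                break; // end of stream reached before maxBytes were read
            }
            total += n;
        }
        return Arrays.copyOf(buffer, total);
    }

    // Usage (paths and byte count are supplied by the caller):
    //   java TruncateSample <source file> <target file> <max bytes>
    public static void main(String[] args) throws IOException {
        Path source = Paths.get(args[0]);
        Path target = Paths.get(args[1]);
        int maxBytes = Integer.parseInt(args[2]);
        try (InputStream in = Files.newInputStream(source)) {
            Files.write(target, head(in, maxBytes));
        }
    }
}
```

The truncated file only needs to be long enough to reach the first row on which the decoder fails, so it should be checked that the same exception still occurs with the shortened data before attaching it.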

      I have tested this against the latest trunk version of PDFBox, but I believe the decoder implementation is essentially the same in all versions.

      Attachments

        1. TestCCITTFaxDecoder.java (1 kB, Petr Slaby)
        2. PDFBOX-3338-014261-p3.pdf (104 kB, Tilman Hausherr)
        3. CCITTFaxFilter.patch (33 kB, Petr Slaby)
        4. CCITTFaxDecoderStream-Changes-by-Petr-and-Tilman-diff.txt (5 kB, Tilman Hausherr)
        5. CCITTFaxDecoderStream.java (25 kB, Tilman Hausherr)
        6. 1.tiff (0.1 kB, Petr Slaby)


          People

            tilman Tilman Hausherr
            pslabycz Petr Slaby
            Votes: 0
            Watchers: 4
