  1. PDFBox
  2. PDFBOX-4484

Some JBIG2 images are corrupted when subsampling is enabled

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.9, 2.0.10, 2.0.11, 2.0.12, 2.0.14
    • Fix Version/s: 2.0.15
    • Component/s: Rendering

    Description

      We have observed problems when rendering some PDFs containing JBIG2 images if the subsampling option is enabled.

      It is possible to reproduce this with the file I have attached using the CLI:

      $ java -jar pdfbox-app-2.0.14.jar PDFToImage -subsampling test-jbig2.pdf
      Mar 12, 2019 3:17:09 PM org.apache.pdfbox.pdmodel.graphics.image.SampledImageReader from1Bit
      WARNING: premature EOF, image will be incomplete
      

      In the output, the extracted image is distorted and has incorrect dimensions. This does not appear to affect all JBIG2 images: for instance, the file "ItDoesntWorkScan.pdf" attached to PDFBOX-1067 contains a JBIG2 image and is rendered correctly by the same command. PDFs containing images encoded with algorithms other than JBIG2 also appear to be unaffected, although I haven't tested this exhaustively.

      We have worked around this problem for now by disabling the subsampling option.
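
      For anyone who wants to keep subsampling on where it is safe, here is a minimal sketch of a version guard. It assumes every release from 2.0.9 up to (but not including) the fix version 2.0.15 is affected, which goes slightly beyond the versions listed on this issue (2.0.13 is not listed). SubsamplingGuard and shouldDisableSubsampling are hypothetical names for illustration, not part of PDFBox.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of PDFBox.
// Assumption: PDFBox 2.0.9 through 2.0.14 are affected by PDFBOX-4484
// and 2.0.15 contains the fix, as the issue metadata suggests.
final class SubsamplingGuard {

    // Leading "major.minor.patch"; a suffix such as "-SNAPSHOT" is ignored.
    private static final Pattern VERSION =
            Pattern.compile("^(\\d{1,5})\\.(\\d{1,5})\\.(\\d{1,5})");

    private SubsamplingGuard() {
    }

    /**
     * Returns true if subsampling should be left disabled for the given
     * PDFBox version string. Unknown or unparseable versions are treated
     * conservatively as affected.
     */
    static boolean shouldDisableSubsampling(String pdfboxVersion) {
        if (pdfboxVersion == null) {
            return true;
        }
        Matcher m = VERSION.matcher(pdfboxVersion.trim());
        if (!m.find()) {
            return true;
        }
        int major = Integer.parseInt(m.group(1));
        int minor = Integer.parseInt(m.group(2));
        int patch = Integer.parseInt(m.group(3));
        return major == 2 && minor == 0 && patch >= 9 && patch <= 14;
    }
}
```

      In application code the result would be negated and passed to the renderer's subsampling switch (I believe this is PDFRenderer.setSubsamplingAllowed in 2.0.x, with the running version available from org.apache.pdfbox.util.Version.getVersion(), but please check both against the Javadoc for your release).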

      Attachments

        1. test-jbig2.pdf (60 kB, Pete Nattress)
        2. test-jbig21-fixed-subsampling.jpg (371 kB, Pete Nattress)
        3. test-jbig21-no-subsampling.jpg (388 kB, Pete Nattress)
        4. test-jbig21-subsampling.jpg (172 kB, Pete Nattress)

          People

            Assignee: Tilman Hausherr (tilman)
            Reporter: Pete Nattress (pnattress)
            Votes: 0
            Watchers: 3
