PDFBox / PDFBOX-3153

Direct JPEG extraction results in invalid images in 2.0.0 releases.

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.8.10, 1.8.11, 2.0.0
    • Fix Version/s: 1.8.11, 2.0.0
    • Component/s: PDModel
    • Environment: Observed on both Linux and Mac

    Description

      When I run the pdfbox-app ExtractImages tool on a PDF containing an image with a DeviceRGB colorspace, the resulting JPEG file is invalid and very large (5.3 MB, while the source PDF is 320 KB).

      I see this with the 2.0.0-RC2 release and also with a build from today's trunk.

      If I modify the code to force usage of ImageIO, a valid JPEG file results.

      The image extracts properly in the 1.8.10 version.
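
      The ImageIO workaround mentioned above can be illustrated with a minimal sketch. It is not the PDFBox code and not the actual fix. It assumes the decoded image is already available as a BufferedImage (in PDFBox 2.0 that would presumably come from PDImageXObject.getImage()), re-encodes it through ImageIO instead of copying the embedded stream, and applies a crude framing check on the result. The class and method names are hypothetical, and the check only looks at the SOI and EOI markers, so it can reject files with trailing bytes and cannot prove that an image decodes correctly.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical helper for illustration only; not part of PDFBox.
public class JpegReencodeSketch {

    // Crude sanity check: a JPEG stream starts with the SOI marker (FF D8)
    // and normally ends with the EOI marker (FF D9).
    static boolean looksLikeJpeg(byte[] data) {
        int n = data.length;
        return n >= 4
                && (data[0] & 0xFF) == 0xFF && (data[1] & 0xFF) == 0xD8
                && (data[n - 2] & 0xFF) == 0xFF && (data[n - 1] & 0xFF) == 0xD9;
    }

    // Re-encode an already decoded image through ImageIO rather than
    // writing the raw embedded stream straight to disk.
    static byte[] encodeWithImageIO(BufferedImage image) throws IOException {
        ImageIO.setUseCache(false); // keep everything in memory, no temp files
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (!ImageIO.write(image, "jpg", out)) {
            throw new IOException("No ImageIO JPEG writer available");
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the image decoded from the PDF (assumption).
        BufferedImage image = new BufferedImage(16, 16, BufferedImage.TYPE_INT_RGB);
        byte[] jpeg = encodeWithImageIO(image);
        System.out.println("JPEG framing looks valid: " + looksLikeJpeg(jpeg));
    }
}
```

      Re-encoding is lossy and discards the original compressed data, so it trades fidelity for a file that ordinary viewers can open. Running the same framing check on a directly extracted file is a quick way to see whether it is a JPEG at all.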

      Attachments

        1. parents.pdf
          313 kB
          John Logan

          People

            Assignee: Tilman Hausherr
            Reporter: John Logan