PDFBox / PDFBOX-3153

Direct JPEG extraction results in invalid images in 2.0.0 releases.

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.8.10, 1.8.11, 2.0.0
    • Fix Version/s: 1.8.11, 2.0.0
    • Component/s: PDModel
    • Environment:
      Observed on both Linux and Mac

      Description

      When I run pdfbox-app ExtractImages on a PDF containing an image with a DeviceRGB colorspace, the resulting JPEG file is very large (5.3MB, while the source PDF is 320KB).
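
One way to confirm whether an extracted file really is a JPEG is sketched below. It is a minimal illustration and not part of PDFBox: the class and method names are hypothetical, and it relies only on the standard `javax.imageio` API. The first check looks for the JPEG start-of-image marker (0xFF 0xD8); the second asks ImageIO to decode the bytes.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical helper for illustration; not a PDFBox class.
final class JpegCheck {

    /** Returns true if the data starts with the JPEG start-of-image marker 0xFF 0xD8. */
    static boolean hasJpegSignature(byte[] data) {
        return data != null
                && data.length >= 2
                && (data[0] & 0xFF) == 0xFF
                && (data[1] & 0xFF) == 0xD8;
    }

    /** Returns true if ImageIO finds a reader for the data and decodes it without error. */
    static boolean isDecodable(byte[] data) {
        try {
            BufferedImage image = ImageIO.read(new ByteArrayInputStream(data));
            // ImageIO.read returns null when no registered reader recognizes the format.
            return image != null;
        } catch (IOException | RuntimeException e) {
            // Some readers throw on corrupt input instead of returning null.
            return false;
        }
    }
}
```

A file that fails the signature check was never written as JPEG data; a file that passes it but cannot be decoded is more likely truncated or corrupt.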

      I see this with the 2.0.0-RC2 release and also with a build from today's trunk.

      If I modify the code to force usage of ImageIO, a valid JPEG file results.
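
The ImageIO route mentioned above can be sketched roughly as follows. This is not the actual change made to ExtractImages; it only shows the general idea of encoding an already decoded image with the standard `ImageIO.write` call instead of copying the embedded stream bytes. The class and method names are hypothetical, and the sketch assumes the caller already holds a `BufferedImage` (in PDFBox 2.0 this could come from `PDImageXObject.getImage()`).

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical helper for illustration; not a PDFBox class.
final class JpegReencode {

    /**
     * Encodes a decoded image as JPEG through ImageIO.
     * Assumes an opaque RGB or grayscale image; images with an alpha
     * channel may be rejected by the JPEG writer.
     */
    static byte[] toJpeg(BufferedImage image) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // ImageIO.write returns false when no suitable writer is found.
        if (!ImageIO.write(image, "jpg", out)) {
            throw new IOException("No ImageIO JPEG writer accepted this image type");
        }
        return out.toByteArray();
    }
}
```

Re-encoding always yields a well-formed JPEG, at the cost of a second lossy compression pass, which is why direct extraction is preferable whenever the embedded stream can be copied unchanged.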

      The image extracts properly in the 1.8.10 version.

        Attachments

        1. parents.pdf
          313 kB
          John Logan

            People

            • Assignee:
              Tilman Hausherr (tilman)
            • Reporter:
              John Logan (nerff)
