Uploaded image for project: 'PDFBox'
  1. PDFBox
  2. PDFBOX-4058

High memory consumption when extracting image from PDF file

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.5, 2.0.6, 2.0.7, 2.0.8
    • Fix Version/s: 2.0.9, 3.0.0 PDFBox
    • Component/s: Rendering
    • Labels:
    • Environment:
      windows 10 / Linux

      Description

      When rendering an image at 300 dpi from the included PDF, my java process uses a huge amount of memory.
      The document is only 45 Kb in size and contains 2 pages, my JVM is unable to extract even 1 page with 3G of memory. Setting Xmx to 4G works but is not the solution I want.
      The error occurs when calling PDFRenderer.renderImageWithDPI()

      I already tried tweaking the memory usage in my application to use a scratch file while loading the document as well as avoiding caching of XObjects as described here: https://pdfbox.apache.org/2.0/faq.html#outofmemoryerror
      These didn't work.

      The issue can be reproduced using the pdfbox-app utility:
      java -Xmx3G -jar pdfbox-app-2.0.8.jar PDFToImage
      HighMemoryFootprint.pdf -dpi 300 -color RGB -page 1

      What can not be changed?

      • 300 dpi will not be decreased.
      • Max Java memory will not be increased: 3GB is ridiculous for a 45kb PDF file.

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                tilman Tilman Hausherr
                Reporter:
                bjorn.misseghers Bjorn Misseghers
              • Votes:
                1 Vote for this issue
                Watchers:
                4 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: