  1. PDFBox
  2. PDFBOX-4058

High memory consumption when extracting image from PDF file

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.5, 2.0.6, 2.0.7, 2.0.8
    • Fix Version/s: 2.0.9, 3.0.0 PDFBox
    • Component/s: Rendering
    • Environment: Windows 10 / Linux

    Description

      When rendering a page to an image at 300 dpi from the attached PDF, my Java process uses a huge amount of memory.
      The document is only 45 KB in size and contains 2 pages, yet my JVM is unable to render even 1 page with 3 GB of memory. Setting -Xmx to 4G works, but that is not the solution I want.
      The error occurs when calling PDFRenderer.renderImageWithDPI().

      I have already tried reducing memory usage in my application by using a scratch file while loading the document and by avoiding caching of XObjects, as described here: https://pdfbox.apache.org/2.0/faq.html#outofmemoryerror
      Neither helped.

      The issue can be reproduced using the pdfbox-app utility:
      java -Xmx3G -jar pdfbox-app-2.0.8.jar PDFToImage HighMemoryFootprint.pdf -dpi 300 -color RGB -page 1

      What cannot be changed:

      • The resolution of 300 dpi will not be decreased.
      • The maximum Java memory will not be increased: 3 GB is ridiculous for a 45 KB PDF file.
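
      As a rough illustration of why 3 GB looks out of proportion, the size of the output bitmap for one page can be estimated from the page size and the resolution. This is only a sketch under stated assumptions: the page dimensions of HighMemoryFootprint.pdf are not given in this report, so an A4 page (595 x 842 points) is assumed, along with 4 bytes per pixel. The class and method names are hypothetical. PDF user space uses 72 points per inch, so the scale factor is dpi / 72.

```java
// Back-of-the-envelope estimate of the output bitmap for one rendered page.
// Assumptions (NOT taken from the report): A4 page of 595 x 842 points,
// 4 bytes per pixel. RasterEstimate is a hypothetical helper, not a PDFBox class.
public class RasterEstimate {
    static final double POINTS_PER_INCH = 72.0;

    // Converts a length in points to whole pixels at the given dpi (at least 1).
    static int pixels(double points, double dpi) {
        return (int) Math.max(Math.floor(points * dpi / POINTS_PER_INCH), 1);
    }

    // Bytes needed for a width x height bitmap at the given dpi.
    static long bufferBytes(double widthPt, double heightPt, double dpi, int bytesPerPixel) {
        return (long) pixels(widthPt, dpi) * pixels(heightPt, dpi) * bytesPerPixel;
    }

    public static void main(String[] args) {
        double widthPt = 595;   // assumed A4 width in points
        double heightPt = 842;  // assumed A4 height in points
        double dpi = 300;       // resolution from the report
        long bytes = bufferBytes(widthPt, heightPt, dpi, 4);
        System.out.println(pixels(widthPt, dpi) + " x " + pixels(heightPt, dpi)
                + " px, " + bytes / (1024 * 1024) + " MiB");
    }
}
```

      Under these assumptions the final image is on the order of tens of megabytes, a small fraction of a 3 GB heap. The estimate covers only the output bitmap and says nothing about intermediate buffers the renderer may allocate.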

      Attachments

        1. HighMemoryFootprint.pdf
          44 kB
          Bjorn Misseghers

            People

              Assignee: Tilman Hausherr (tilman)
              Reporter: Bjorn Misseghers (bjorn.misseghers)
              Votes: 1
              Watchers: 5