[PDFBOX-4058] High memory consumption when extracting image from PDF file - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 2.0.5, 2.0.6, 2.0.7, 2.0.8
Fix Version/s: 2.0.9, 3.0.0 PDFBox
Component/s: Rendering
Labels:
- regression
Environment:
Windows 10 / Linux

Description

When rendering an image at 300 dpi from the attached PDF, my Java process uses a huge amount of memory.
The document is only 45 kB in size and contains 2 pages, yet my JVM is unable to render even 1 page with 3 GB of memory. Setting -Xmx to 4G works, but that is not the solution I want.
The error occurs when calling PDFRenderer.renderImageWithDPI().

I already tried tweaking the memory usage in my application to use a scratch file while loading the document as well as avoiding caching of XObjects as described here: https://pdfbox.apache.org/2.0/faq.html#outofmemoryerror
These didn't work.

The issue can be reproduced using the pdfbox-app utility:
java -Xmx3G -jar pdfbox-app-2.0.8.jar PDFToImage HighMemoryFootprint.pdf -dpi 300 -color RGB -page 1

What cannot be changed?

The 300 dpi resolution will not be decreased.
The maximum Java memory will not be increased: 3 GB is ridiculous for a 45 kB PDF file.
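
To put the 3 GB figure in perspective, here is a rough estimate of what a single rendered page should need as a raw raster. This is only a sketch under assumptions that are not stated in the report: the page size is taken to be A4 (595 x 842 pt), and the output image is assumed to use 4 bytes per pixel. The class and method names are made up for illustration and are not part of PDFBox.

```java
// Back-of-the-envelope raster size for one rendered page.
// Assumptions (not from the report): A4 page of 595 x 842 pt, 4 bytes per pixel.
class RenderMemoryEstimate {

    // PDF user space has 72 points per inch, so pixels = points * dpi / 72.
    static int pixels(double points, double dpi) {
        return (int) Math.max(Math.floor(points * dpi / 72.0), 1);
    }

    // Bytes needed for one uncompressed raster of the whole page.
    static long rasterBytes(double widthPt, double heightPt, double dpi, int bytesPerPixel) {
        return (long) pixels(widthPt, dpi) * pixels(heightPt, dpi) * bytesPerPixel;
    }

    public static void main(String[] args) {
        double dpi = 300;
        int width = pixels(595, dpi);   // assumed A4 width in points
        int height = pixels(842, dpi);  // assumed A4 height in points
        long bytes = rasterBytes(595, 842, dpi, 4);
        System.out.printf("%d x %d px, %.1f MiB%n", width, height, bytes / (1024.0 * 1024.0));
    }
}
```

Under these assumptions the output image itself comes to about 33 MiB, so a 3 GB limit leaves room for dozens of full-page buffers. The actual page size of HighMemoryFootprint.pdf may differ, but the order of magnitude suggests the memory is going somewhere other than the final image.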

Attachments

HighMemoryFootprint.pdf (44 kB, uploaded 08/Jan/18 14:48 by Bjorn Misseghers)

Issue Links

relates to: PDFBOX-3688 Cache TilingPaint generation (Closed)
links to: JDK-4802550: java.util.WeakHashMap throws OutOfMemoryError on Linux

People

Assignee: Tilman Hausherr
Reporter: Bjorn Misseghers
Votes: 1
Watchers: 5

Dates

Created: 08/Jan/18 15:19
Updated: 24/Mar/18 09:41
Resolved: 10/Jan/18 20:44