Description
RandomAccessBuffer holds the uncompressed image data during operation, because that is exactly what PDFBox's ExtractImages does.
However, keeping the uncompressed image in memory instead of the compressed one consumes far too much memory, and the same applies to the many other PDF XObjects that can use a filter to compress themselves. It would be good if PDFBox provided an option to revert a COSObject to the state it was in just before the RandomAccess object was created (that is, the state in which the PDF XObject stream has been parsed but the COSDictionary objects have not yet been created, because the user has not requested them through a get____() method). This is a crucial feature for letting PDFBox analyze huge PDF files (>100 MB).
In the current source, one must close a COSStream whenever it is not needed (and I know that a closed stream cannot be reopened).
Class Name | Shallow Heap | Retained Heap
------------------------------------------------------------------
org.apache.pdfbox.cos.COSObject @ 0x5ad4940 | 24 | 8,187,264
 | 0 | 0
 | 24 | 24
 | 32 | 8,187,216
 | 8 | 8
 | 56 | 552
 | 48 | 8,186,528
 | 8 | 8
 | 16,400 |
 | 24 | 8,170,080
'- Total: 3 entries | |
 | 32 | 32
 | 16 | 16
 | 32 | 32
'- Total: 6 entries | |
 | 24 | 24
------------------------------------------------------------------
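The request above amounts to retaining only the encoded (filtered) stream bytes and decoding them on demand. A minimal sketch of that idea follows, using only java.util.zip from the JDK. CompressedStreamHolder and all of its methods are hypothetical names chosen for illustration; they are not part of PDFBox, and the sketch assumes a single Flate (zlib) filter rather than PDFBox's actual filter chain.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

/** Hypothetical holder (not a PDFBox class): keeps only the encoded bytes on the heap. */
class CompressedStreamHolder {
    private final byte[] encoded; // zlib/Flate-compressed bytes, as stored in the file

    CompressedStreamHolder(byte[] encoded) {
        this.encoded = encoded;
    }

    /** Number of bytes retained between reads. */
    int retainedSize() {
        return encoded.length;
    }

    /** Decodes on demand; the decoded data is streamed and never cached. */
    InputStream openDecoded() {
        return new InflaterInputStream(new ByteArrayInputStream(encoded));
    }

    /** Counts the decoded bytes using a small fixed buffer. */
    long decodedLength() {
        long total = 0;
        byte[] buf = new byte[8192];
        try (InputStream in = openDecoded()) {
            int n;
            while ((n = in.read(buf)) != -1) {
                total += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    /** Demo helper: Flate-compresses raw bytes so the sketch is self-contained. */
    static byte[] deflate(byte[] raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (DeflaterOutputStream dos = new DeflaterOutputStream(out)) {
            dos.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

Because openDecoded() builds a fresh decoder over the retained bytes each time, the stream can be read again after an earlier reader has been closed. The trade-off is that every read pays the decompression cost again.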
Attachments
Issue Links
- is depended upon by
  - PDFBOX-1586 IndexOutOfBoundsException when saving a document (at random) (Closed)
- is related to
  - PDFBOX-2825 Requested array size exceeds VM limit (Closed)