Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Fixed
- 2.0.19
- None
Description
Our PDF is parsed in less than 200 ms with 2.0.18, but takes more than 8 seconds with 2.0.19. The same issue is still present in 2.0.26.
SmallMap was introduced in version 2.0.19, and we have had this performance issue since that change.
As a workaround, we patched our code to replace the SmallMap implementation like this:
package org.apache.pdfbox.util;

import java.util.LinkedHashMap;

public class SmallMap<K, V> extends LinkedHashMap<K, V>
{
    // nothing: use the standard LinkedHashMap
}
With this replacement, the performance issue disappears.
Our test is really simple:
long start = System.currentTimeMillis();
try (PDDocument document = PDDocument.load(new File(inFile)))
{
    // nothing: only parsing is evaluated
}
long duration = System.currentTimeMillis() - start;
assertTrue(duration < 500);
I understand that SmallMap can solve problems in some cases, but would it be possible to add a factory that creates this map, so that users can configure which Map implementation is used?
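To make the request concrete, here is a minimal sketch of what such a factory could look like. The class name MapFactory and its methods setSupplier and newMap are hypothetical and are not part of PDFBox; the sketch only assumes that the library would obtain its maps through one central call instead of instantiating SmallMap directly.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;
import java.util.function.Supplier;

/**
 * Hypothetical factory (not an existing PDFBox class): a single place that
 * decides which Map implementation is created.
 */
final class MapFactory
{
    // Assumed default: an insertion-ordered hash map.
    private static volatile Supplier<Map<Object, Object>> supplier = LinkedHashMap::new;

    private MapFactory()
    {
    }

    /** Lets the application choose the Map implementation to use. */
    static void setSupplier(Supplier<Map<Object, Object>> newSupplier)
    {
        supplier = Objects.requireNonNull(newSupplier, "newSupplier");
    }

    /** Creates a new, empty map using the configured supplier. */
    @SuppressWarnings("unchecked")
    static <K, V> Map<K, V> newMap()
    {
        // Safe only because the supplier must return a fresh, empty map.
        return (Map<K, V>) supplier.get();
    }
}
```

An application that prefers lookup speed over memory footprint could then call MapFactory.setSupplier(HashMap::new) once at startup, while a memory-constrained one could register a compact implementation instead.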
Attachments
Issue Links
- relates to PDFBOX-3284 Big Pdf parsing to text - Out of memory (Closed)