Details
-
Bug
-
Status: Closed
-
Minor
-
Resolution: Not A Bug
-
2.0.15
-
None
-
None
Description
PDICCBased.java uses synchronized with static variable, e.g. synchronized (LOG) . It doesn't look to me it really needs to do it this way. This is very inefficient when multiple threads process different PDF at the same time. Change it to synchronized (this) will improve the performance.
Sample code simulates multiple threads process different PDF at the same time:
public static void main(String[] args) throws IOException { for (int i = 0; i < 10; i++) { // just run multiple time doWork(); } } private static void doWork() throws IOException { long startTime = System.currentTimeMillis(); String pdfFilename = "<absolute path to your pdf file>"; // replace this with your test file System.setProperty("sun.java2d.cmm", "sun.java2d.cmm.kcms.KcmsServiceProvider"); PDDocument document = PDDocument.load(new File(pdfFilename)); List<PDDocument> pdfPages = new Splitter().split(document); Map<Integer, PDDocument> pdfPagesWithIndex = new HashMap<>(); for (int i = 0; i < pdfPages.size(); i++) { pdfPagesWithIndex.put(i, pdfPages.get(i)); } // multiple threads running in parallel pdfPagesWithIndex.entrySet().parallelStream().forEach(entry -> { try { processPDF(entry.getKey(), entry.getValue()); } catch (Exception e) { System.out.println(e); } }); System.out.println("Convertion time: " + (System.currentTimeMillis() - startTime)); try { document.close(); } catch (IOException ignored) { } } private static void processPDF(int index, PDDocument pdfPage) throws IOException { PDFRenderer renderer = new PDFRenderer(pdfPage); try { renderer.renderImageWithDPI(0, 180, ImageType.RGB); } catch (IOException e) { System.out.println(e); } try { pdfPage.close(); } catch (IOException ignored) { } }
I observed by changing synchronized (LOG) to synchronized (this), the above code can have maybe 20-30% reduction in latency. If I do a thread dump, I can see many threads are blocked on synchronized (LOG).
Attachments
Issue Links
- relates to
-
PDFBOX-3267 Using threads results in different images
- Closed
-
PDFBOX-3641 ICC_ColorSpace usage is not thread safe
- Closed