Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Fixed
- 2.0.3
- None
- None
- Environment: Mac OSX 10.12.3, Java(TM) SE Runtime Environment (build 1.8.0_25-b17) Java HotSpot(TM) 64-Bit Server VM (build 25.25-b02, mixed mode)
Description
Problem
While we were using Tika 1.10 (PDFBox 1.8.10) to parse PDF documents in a multi-threaded application, processing unexpectedly halted. The thread dump produced by kill -3 showed:
"pool-2-thread-18" #50 prio=5 os_prio=0 tid=0x00002af088a67000 nid=0xc9b9 in Object.wait() [0x00002af0dc803000]
   java.lang.Thread.State: RUNNABLE
    at org.apache.pdfbox.pdfparser.BaseParser.parseDirObject(BaseParser.java:1348)
x 15

"pool-2-thread-13" #45 prio=5 os_prio=0 tid=0x00002af0cf910800 nid=0xc9b4 in Object.wait() [0x00002af0dc2ff000]
   java.lang.Thread.State: RUNNABLE
    at org.apache.pdfbox.cos.COSDocument.getObjectFromPool(COSDocument.java:720)
    at org.apache.pdfbox.pdfparser.PDFParser.parseObject(PDFParser.java:685)
x 2

"pool-2-thread-11" #43 prio=5 os_prio=0 tid=0x00002af0cfba6000 nid=0xc9b2 in Object.wait() [0x00002af0dc0fc000]
   java.lang.Thread.State: RUNNABLE
    at org.apache.pdfbox.cos.COSNumber.<clinit>(COSNumber.java:33)
    at org.apache.pdfbox.pdfparser.BaseParser.parseDirObject(BaseParser.java:1348)
x 1
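Upon further investigation, there appears to be a risk of deadlock when BaseParser calls COSNumber.get() on one thread while COSDocument calls COSInteger.get() on another. The COSNumber.<clinit> frame in the dump suggests the threads are stuck in static class initialization.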
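For readers unfamiliar with this failure mode, here is a minimal sketch of a static initialization deadlock between a base class and its subclass. The classes Num and Int are hypothetical stand-ins for COSNumber and COSInteger; that PDFBox's initializers form exactly this cycle is an assumption based on the dump above, not something verified against the PDFBox source. The sleep only widens the race window so the hang is easy to observe, and the threads are daemons so the JVM can still exit.

```java
import java.util.concurrent.CountDownLatch;

public class StaticInitDeadlockDemo {

    // Released from inside Num's static initializer, so the second thread
    // starts initializing Int while Num is still mid-initialization.
    static final CountDownLatch NUM_INIT_STARTED = new CountDownLatch(1);

    // Hypothetical base class whose static initializer needs its subclass.
    static class Num {
        static final Num ZERO;
        static {
            NUM_INIT_STARTED.countDown();
            pause(300);          // widen the race window for the demo
            ZERO = Int.get(0);   // requires Int to be initialized
        }
    }

    // Hypothetical subclass; the JVM must finish initializing the
    // superclass Num before Int's own initializer can run.
    static class Int extends Num {
        static final Int CACHED_ZERO = new Int();
        static Int get(int ignored) { return CACHED_ZERO; }
    }

    static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Returns true if both threads are still stuck after the timeout.
    static boolean deadlocks() {
        // Thread A triggers Num's initialization first.
        Thread a = new Thread(() -> { Object unused = Num.ZERO; }, "init-Num");
        // Thread B triggers Int's initialization once A is inside Num's initializer.
        Thread b = new Thread(() -> {
            try {
                NUM_INIT_STARTED.await();
                Int.get(0);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "init-Int");
        a.setDaemon(true);
        b.setDaemon(true);
        a.start();
        b.start();
        try {
            a.join(2000);
            b.join(2000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return a.isAlive() && b.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("both threads stuck: " + deadlocks());
    }
}
```

Thread A holds the in-progress initialization of Num and waits for Int, while thread B holds the in-progress initialization of Int and waits for its superclass Num. HotSpot reports threads waiting on class initialization as RUNNABLE, in Object.wait(), which matches the dump above.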
I was able to semi-reliably replicate this issue with the below Spock test:
import org.apache.pdfbox.cos.COSInteger
import org.apache.pdfbox.cos.COSNumber
import spock.lang.Specification

class ThreadingIssueSpec extends Specification {
    def "testy test"() {
        setup:
        Thread thread = new Thread(new Runnable() {
            @Override
            void run() {
                for (int i = 0; i < 100; i++) {
                    COSNumber.get("-")
                }
            }
        })
        thread.start()
        for (int i = 0; i < 100; i++) {
            COSInteger.get("-")
        }
        thread.join()

        expect:
        1 == 1
    }
}
(you'll likely need to run this several times before the test hangs, but it does eventually hang)
I updated my Tika dependency to 1.14 (PDFBox 2.0.3) and was still able to replicate this issue.
Issue Links
- is related to PDFBOX-4109 Static Initialization Deadlock between COSNumber/COSInteger (2) (Closed)