Description
Several files in the TIKA test arena produce the same exception:
Caused by: java.io.IOException: Catalog cannot be found
    org.apache.pdfbox.cos.COSDocument.getCatalog(COSDocument.java:363)
    org.apache.pdfbox.pdfparser.PDFParser.initialParse(PDFParser.java:200)
    org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:230)
    org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:854)
    org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:797)
    org.apache.pdfbox.debugger.PDFDebugger.parseDocument(PDFDebugger.java:1216)
    org.apache.pdfbox.debugger.PDFDebugger.readPDFFile(PDFDebugger.java:1144)
    org.apache.pdfbox.debugger.PDFDebugger.readPDFFile(PDFDebugger.java:1129)
    org.apache.pdfbox.debugger.PDFDebugger.access$13(PDFDebugger.java:1126)
    org.apache.pdfbox.debugger.PDFDebugger$11.actionPerformed(PDFDebugger.java:1230)
    java.security.AccessController.doPrivileged(Native Method)
    java.security.ProtectionDomain$1.doIntersectionPrivilege(ProtectionDomain.java:76)
    java.security.ProtectionDomain$1.doIntersectionPrivilege(ProtectionDomain.java:87)
    java.security.AccessController.doPrivileged(Native Method)
    java.security.ProtectionDomain$1.doIntersectionPrivilege(ProtectionDomain.java:76)
Both 1.8.10 (non-sequential parser only) and 2.0.0 produce this exception.
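
For triaging the affected files, a rough pre-check without PDFBox may help. The sketch below is only an assumption about how one might sort the files; it is not part of PDFBox or TIKA, and the class name CatalogCheck and method hasCatalogMarker are hypothetical. It scans the raw bytes for an uncompressed "/Type /Catalog" entry. A negative result does not prove that the catalog is missing, because the catalog dictionary may sit inside a compressed object stream.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.regex.Pattern;

// Hypothetical triage helper, not part of PDFBox or TIKA.
class CatalogCheck {
    // Matches "/Type /Catalog" with optional whitespace between the two names.
    private static final Pattern CATALOG = Pattern.compile("/Type\\s*/Catalog\\b");

    // Returns true if the raw bytes contain an uncompressed catalog marker.
    static boolean hasCatalogMarker(byte[] data) {
        // ISO-8859-1 maps every byte to exactly one char, so binary data decodes safely.
        String text = new String(data, StandardCharsets.ISO_8859_1);
        return CATALOG.matcher(text).find();
    }

    public static void main(String[] args) throws IOException {
        // Each argument is treated as a path to a PDF file to inspect.
        for (String arg : args) {
            Path path = Paths.get(arg);
            boolean found = hasCatalogMarker(Files.readAllBytes(path));
            System.out.println(path + ": "
                    + (found ? "catalog marker present" : "no uncompressed catalog marker"));
        }
    }
}
```

Files reported as having no uncompressed marker are candidates for closer inspection, for example for a truncated file or a damaged cross-reference section.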