Details
Description
Parsing the attached Word document, which opens fine in Word, causes the Tika parser to throw the following exception:
java.lang.NullPointerException
at org.apache.poi.xwpf.usermodel.XWPFStyles.getStyle(XWPFStyles.java:198)
at org.apache.tika.parser.microsoft.ooxml.XWPFWordExtractorDecorator.extractParagraph(XWPFWordExtractorDecorator.java:149)
at org.apache.tika.parser.microsoft.ooxml.XWPFWordExtractorDecorator.extractIBodyText(XWPFWordExtractorDecorator.java:107)
at org.apache.tika.parser.microsoft.ooxml.XWPFWordExtractorDecorator.extractTable(XWPFWordExtractorDecorator.java:362)
at org.apache.tika.parser.microsoft.ooxml.XWPFWordExtractorDecorator.extractHeaderText(XWPFWordExtractorDecorator.java:414)
at org.apache.tika.parser.microsoft.ooxml.XWPFWordExtractorDecorator.extractHeaders(XWPFWordExtractorDecorator.java:404)
at org.apache.tika.parser.microsoft.ooxml.XWPFWordExtractorDecorator.buildXHTML(XWPFWordExtractorDecorator.java:89)
at org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor.getXHTML(AbstractOOXMLExtractor.java:109)
at org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory.parse(OOXMLExtractorFactory.java:112)
at org.apache.tika.parser.microsoft.ooxml.OOXMLParser.parse(OOXMLParser.java:87)
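The trace shows the exception being raised inside XWPFStyles.getStyle, which Tika calls from extractParagraph while walking a table in a header. The report does not say which reference was null. The sketch below only illustrates one common way a style lookup of this kind can fail, under the assumption (not confirmed here) that the document contains a style entry without an ID. Every class and method name in it is a hypothetical stand-in, not the POI or Tika API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are NOT the POI classes.
class StyleLookupSketch {

    /** Hypothetical style record whose ID may be null (assumed: the document omits it). */
    static final class Style {
        final String id;   // may be null
        final String name;

        Style(String id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    /** Fragile lookup: calls equals() on the stored ID, so an entry with a null ID throws. */
    static Style fragileLookup(List<Style> styles, String wantedId) {
        for (Style s : styles) {
            if (s.id.equals(wantedId)) {  // NullPointerException when s.id == null
                return s;
            }
        }
        return null;
    }

    /** Null-safe lookup: tolerates a null list, a null argument, and entries without an ID. */
    static Style safeLookup(List<Style> styles, String wantedId) {
        if (styles == null || wantedId == null) {
            return null;
        }
        for (Style s : styles) {
            // Calling equals() on the non-null argument avoids dereferencing s.id.
            if (s != null && wantedId.equals(s.id)) {
                return s;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<Style> styles = new ArrayList<>();
        styles.add(new Style(null, "unnamed"));          // entry with no ID
        styles.add(new Style("Heading1", "heading 1"));

        try {
            fragileLookup(styles, "Heading1");
        } catch (NullPointerException e) {
            System.out.println("fragile lookup failed with NullPointerException");
        }
        System.out.println("safe lookup found: " + safeLookup(styles, "Heading1").name);
    }
}
```

A caller that cannot upgrade the underlying library could apply the same idea one level up, by treating a failed style lookup as "no style" instead of aborting the whole parse; whether that is appropriate here depends on the actual cause, which this report does not establish.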
Attachments
Issue Links
- depends upon TIKA-2181 Upgrade to POI 3.16-beta2 when available (Resolved)