Description
I am using Tika to extract text and feed it to my Lucene indexer. Tika throws a NullPointerException for one particular xlsx file. Other xlsx files parse fine; only this file triggers the exception. I am attaching the xlsx file so you can check it. Kindly help me out.
Code :-
// Tika facade; its construction is not shown in the original snippet and is assumed here
Tika tika = new Tika();
String path = "D:\\CVLKRA-KYC_Download_File_Structure_V3.1.xlsx";
File file = new File(path);
System.out.print(tika.parseToString(file));
Error :-
Exception in thread "main" org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.ooxml.OOXMLParser@54a67a45
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:293)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
    at org.apache.tika.Tika.parseToString(Tika.java:527)
    at org.apache.tika.Tika.parseToString(Tika.java:642)
    at poc.please.TikaPoc.main(TikaPoc.java:42)
Caused by: java.lang.NullPointerException
    at org.apache.poi.xssf.usermodel.XSSFTableStyle.<init>(XSSFTableStyle.java:64)
    at org.apache.poi.xssf.model.StylesTable.readFrom(StylesTable.java:245)
    at org.apache.poi.xssf.model.StylesTable.<init>(StylesTable.java:138)
    at org.apache.poi.xssf.eventusermodel.XSSFReader.getStylesTable(XSSFReader.java:127)
    at org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.buildXHTML(XSSFExcelExtractorDecorator.java:143)
    at org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor.getXHTML(AbstractOOXMLExtractor.java:136)
    at org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.getXHTML(XSSFExcelExtractorDecorator.java:126)
    at org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory.parse(OOXMLExtractorFactory.java:210)
    at org.apache.tika.parser.microsoft.ooxml.OOXMLParser.parse(OOXMLParser.java:113)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
    ... 5 more
Attachments
Issue Links
- depends upon TIKA-3164 Upgrade to POI 5.0.0 when available (Resolved)