Details
- Type: Bug
- Status: Resolved
- Priority: Critical
- Resolution: Won't Fix
- Affects Version/s: 7.3.1
- None
- None
Description
When indexing a BLOB containing an encrypted Office document (xls or xlsx, but I think all types are affected), the import fails with the exception below. If the document is not encrypted, it works fine.
I'm using the DataImportHandler.
The exception also seems to bypass onError=skip and onError=continue, so the whole import fails.
I tried moving the libraries from contrib/extraction/lib/ to server/lib, and the class that cannot be found changes, so it looks like a class loading issue.
This is the base exception:
Exception while processing: document_index document : SolrInputDocument(fields: [site=187, index_type=document, resource_id=3, title_full=Dati cliente.docx, id=d-XXX-3, publish_date=2018-09-28 00:00:00.0, abstract= Azioni di recupero intraprese sulle Fatture telefoniche, insert_date=2019-09-28 00:00:00.0, type=Documenti, url=http://]):org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to read content Processing Document # 1
at org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:69)
at org.apache.solr.handler.dataimport.TikaEntityProcessor.nextRow(TikaEntityProcessor.java:171)
at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:267)
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:476)
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:517)
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:415)
at org.apache.solr.handler.dataimport.DocBuilder.doDelta(DocBuilder.java:364)
at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:225)
at org.apache.solr.handler.dataimport.DataImporter.doDeltaImport(DataImporter.java:452)
at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:485)
at org.apache.solr.handler.dataimport.DataImporter.lambda$runAsync$0(DataImporter.java:466)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.tika.exception.TikaException: TIKA-198: Illegal IOException from org.apache.tika.parser.microsoft.OfficeParser@500efcf1
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:286)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
at org.apache.solr.handler.dataimport.TikaEntityProcessor.nextRow(TikaEntityProcessor.java:165)
... 10 more
Caused by: java.io.IOException: java.lang.ClassNotFoundException: org.apache.poi.poifs.crypt.agile.AgileEncryptionInfoBuilder
at org.apache.poi.poifs.crypt.EncryptionInfo.<init>(EncryptionInfo.java:150)
at org.apache.poi.poifs.crypt.EncryptionInfo.<init>(EncryptionInfo.java:102)
at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:203)
at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:132)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
... 13 more
Caused by: java.lang.ClassNotFoundException: org.apache.poi.poifs.crypt.agile.AgileEncryptionInfoBuilder
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at org.eclipse.jetty.webapp.WebAppClassLoader.loadClass(WebAppClassLoader.java:565)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at org.apache.poi.poifs.crypt.EncryptionInfo.getBuilder(EncryptionInfo.java:222)
at org.apache.poi.poifs.crypt.EncryptionInfo.<init>(EncryptionInfo.java:148)
... 17 more
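The innermost ClassNotFoundException goes through Jetty's WebAppClassLoader, so one way to narrow this down is to check which class loader can actually see the POI class. The following is only a minimal diagnostic sketch, not part of Solr or Tika: the class name `ClassLocator` and the method `locate` are hypothetical, and it assumes the lookup in POI uses the thread context class loader, which the trace suggests but does not prove.

```java
import java.security.CodeSource;

public class ClassLocator {

    /**
     * Tries to load className through the given loader without initializing it.
     * Returns the location it was loaded from, or null if the loader cannot see it.
     */
    static String locate(String className, ClassLoader loader) {
        try {
            Class<?> c = Class.forName(className, false, loader);
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // Classes from the bootstrap loader have no code source.
            return src == null ? "(bootstrap or unknown source)"
                               : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Default to the class named in the stack trace above.
        String name = args.length > 0
                ? args[0]
                : "org.apache.poi.poifs.crypt.agile.AgileEncryptionInfoBuilder";
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        String where = locate(name, loader);
        System.out.println(where == null
                ? name + " is NOT visible to " + loader
                : name + " loaded from " + where);
    }
}
```

Run standalone with the jars from contrib/extraction/lib/ on the classpath, this only shows whether the class exists in those jars. To reproduce the failure itself, the same `locate` call would have to run inside Solr on the thread that performs the import.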
Attachments
Issue Links
- is superceded by SOLR-14783 Remove DIH from 9.0 (Closed)