Solr / SOLR-12985

ClassNotFoundException when indexing encrypted documents

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Won't Fix
    • Affects Version/s: 7.3.1
    • Fix Version/s: None
    • Labels: None

      Description

      When indexing a BLOB that contains an encrypted Office document (xls or xlsx, but I think all types are affected), the import fails with a ClassNotFoundException. If the document is not encrypted, indexing works fine.

      I'm using the DataImportHandler.

      The exception also seems to bypass onError=skip and onError=continue, so the whole import fails.

      I tried moving the libraries from contrib/extraction/lib/ to server/lib, and the class reported as not found changes, so it looks like a class loading issue.
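      For anyone trying to narrow this down, here is a minimal sketch of how class visibility can differ between class loaders. The trace in this report shows the lookup going through Jetty's WebAppClassLoader, but which loader actually holds the POI jars in a given Solr install is an assumption I have not verified. The class name ClassVisibilityCheck and the helper isVisible are made up for illustration and are not part of Solr, Tika or POI.

```java
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical diagnostic helper, not part of Solr, Tika or POI.
final class ClassVisibilityCheck {

    /** Returns true if the named class can be found through the given loader (without initializing it). */
    static boolean isVisible(String className, ClassLoader loader) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // Defaults to the class named in the stack trace of this report.
        String name = args.length > 0
                ? args[0]
                : "org.apache.poi.poifs.crypt.agile.AgileEncryptionInfoBuilder";

        ClassLoader context = Thread.currentThread().getContextClassLoader();
        ClassLoader own = ClassVisibilityCheck.class.getClassLoader();

        System.out.println("visible via context loader:  " + isVisible(name, context));
        System.out.println("visible via defining loader: " + isVisible(name, own));

        // A loader with no URLs and no application parent only sees JDK classes,
        // so a class that exists on the classpath can still be "not found" through it.
        try (URLClassLoader isolated = new URLClassLoader(new URL[0], null)) {
            String self = ClassVisibilityCheck.class.getName();
            System.out.println("this class via isolated loader: " + isVisible(self, isolated));
        }
    }
}
```

      Run with the POI jars on the classpath and the class name as the first argument. If the two loaders disagree, that would be consistent with the jars being visible to one loader but not to the one POI uses for the lookup.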

      This is the base exception:

      Exception while processing: document_index document : SolrInputDocument(fields: [site=187, index_type=document, resource_id=3, title_full=Dati cliente.docx, id=d-XXX-3, publish_date=2018-09-28 00:00:00.0, abstract= Azioni di recupero intraprese sulle Fatture telefoniche, insert_date=2019-09-28 00:00:00.0, type=Documenti, url=http://]):org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to read content Processing Document # 1
          at org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:69)
          at org.apache.solr.handler.dataimport.TikaEntityProcessor.nextRow(TikaEntityProcessor.java:171)
          at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:267)
          at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:476)
          at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:517)
          at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:415)
          at org.apache.solr.handler.dataimport.DocBuilder.doDelta(DocBuilder.java:364)
          at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:225)
          at org.apache.solr.handler.dataimport.DataImporter.doDeltaImport(DataImporter.java:452)
          at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:485)
          at org.apache.solr.handler.dataimport.DataImporter.lambda$runAsync$0(DataImporter.java:466)
          at java.lang.Thread.run(Thread.java:748)
      Caused by: org.apache.tika.exception.TikaException: TIKA-198: Illegal IOException from org.apache.tika.parser.microsoft.OfficeParser@500efcf1
          at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:286)
          at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
          at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
          at org.apache.solr.handler.dataimport.TikaEntityProcessor.nextRow(TikaEntityProcessor.java:165)
          ... 10 more
      Caused by: java.io.IOException: java.lang.ClassNotFoundException: org.apache.poi.poifs.crypt.agile.AgileEncryptionInfoBuilder
          at org.apache.poi.poifs.crypt.EncryptionInfo.<init>(EncryptionInfo.java:150)
          at org.apache.poi.poifs.crypt.EncryptionInfo.<init>(EncryptionInfo.java:102)
          at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:203)
          at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:132)
          at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
          ... 13 more
      Caused by: java.lang.ClassNotFoundException: org.apache.poi.poifs.crypt.agile.AgileEncryptionInfoBuilder
          at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
          at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
          at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
          at org.eclipse.jetty.webapp.WebAppClassLoader.loadClass(WebAppClassLoader.java:565)
          at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
          at org.apache.poi.poifs.crypt.EncryptionInfo.getBuilder(EncryptionInfo.java:222)
          at org.apache.poi.poifs.crypt.EncryptionInfo.<init>(EncryptionInfo.java:148)
          ... 17 more
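      The trace above is four exceptions deep, and only the innermost one (the ClassNotFoundException) names the real problem. As a rough sketch of how such a chain can be unwrapped, and of why the IOException message repeats the ClassNotFoundException text, something like the following works. RootCause and rootCause are hypothetical names for illustration, and the nested exception built in main only imitates the shape of the trace, it is not how Solr or Tika construct it.

```java
import java.io.IOException;

// Hypothetical helper for reading nested "Caused by" chains.
final class RootCause {

    /** Follows getCause() until the innermost exception is reached. */
    static Throwable rootCause(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        // Imitates the shape of the chain in this report.
        Throwable nested = new RuntimeException(
                "Unable to read content",
                new IOException(new ClassNotFoundException(
                        "org.apache.poi.poifs.crypt.agile.AgileEncryptionInfoBuilder")));

        Throwable root = rootCause(nested);
        System.out.println(root.getClass().getName() + ": " + root.getMessage());

        // new IOException(cause) uses cause.toString() as its own message,
        // which is why the trace shows "java.io.IOException: java.lang.ClassNotFoundException: ...".
        System.out.println(nested.getCause().getMessage());
    }
}
```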

        Attachments

        1. crypted.xlsx (16 kB, Luca)
        2. db.sql (21 kB, Luca)
        3. logs.zip (30 kB, Luca)
        4. notcrypted.docx (11 kB, Luca)
        5. schema.zip (19 kB, Luca)

              People

              • Assignee: Unassigned
              • Reporter: lucaver Luca
              • Votes: 0
              • Watchers: 4
