Tika / TIKA-1111

Class loading issues when running in OSGi environment

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 1.3
    • Fix Version/s: None
    • Component/s: packaging
    • Labels: None
    • Environment: Tika 1.3 (tika-core and tika-bundle OSGi bundles), Felix 2.0.5

    Description

      When dom4j is on the system classpath, a class loading error occurs during detection of Office Open XML files:

      java.lang.ExceptionInInitializerError
      at org.apache.poi.openxml4j.opc.internal.unmarshallers.PackagePropertiesUnmarshaller.<clinit>(PackagePropertiesUnmarshaller.java:49)
      at org.apache.poi.openxml4j.opc.OPCPackage.init(OPCPackage.java:154)
      at org.apache.poi.openxml4j.opc.OPCPackage.<init>(OPCPackage.java:141)
      at org.apache.poi.openxml4j.opc.Package.<init>(Package.java:54)
      at org.apache.poi.openxml4j.opc.ZipPackage.<init>(ZipPackage.java:99)
      at org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:207)
      at org.apache.tika.parser.pkg.ZipContainerDetector.detectOfficeOpenXML(ZipContainerDetector.java:194)
      at org.apache.tika.parser.pkg.ZipContainerDetector.detectZipFormat(ZipContainerDetector.java:134)
      at org.apache.tika.parser.pkg.ZipContainerDetector.detect(ZipContainerDetector.java:77)
      at org.apache.tika.detect.CompositeDetector.detect(CompositeDetector.java:61)
      at org.apache.tika.detect.CompositeDetector.detect(CompositeDetector.java:61)
      at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:113)
      at org.apache.tika.parser.ParsingReader$ParsingTask.run(ParsingReader.java:221)
      at java.lang.Thread.run(Thread.java:662)
      Caused by: java.lang.ClassCastException: org.dom4j.DocumentFactory cannot be cast to org.dom4j.DocumentFactory
      at org.dom4j.DocumentFactory.getInstance(DocumentFactory.java:97)
      at org.dom4j.tree.AbstractNode.<clinit>(AbstractNode.java:39)
      ... 14 more

      As a workaround (and possibly a proper solution), I set the thread context class loader while running detection, by wrapping the detector and the parser. This appears to be the common fix for dom4j, because dom4j uses the context class loader during initialization. Ideally, detectors and parsers would run with their own class loader (the one they were loaded from via ServiceLoader) as the context class loader.
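
      The workaround described above can be sketched roughly as follows. This is a minimal illustration, not code from Tika: the class name ContextClassLoaderSwap and the method callWith are hypothetical. It sets the thread context class loader for the duration of one call and restores the previous loader afterwards, even if the call throws.

```java
import java.util.concurrent.Callable;

/**
 * Hypothetical helper (not part of Tika): runs a task with a given
 * thread context class loader and restores the previous one afterwards.
 */
public final class ContextClassLoaderSwap {

    private ContextClassLoaderSwap() {
        // static helper, no instances
    }

    public static <T> T callWith(ClassLoader loader, Callable<T> task) throws Exception {
        Thread thread = Thread.currentThread();
        ClassLoader previous = thread.getContextClassLoader();
        // Libraries that consult the context class loader during
        // initialization will see this loader while the task runs.
        thread.setContextClassLoader(loader);
        try {
            return task.call();
        } finally {
            // Always restore, so the caller's loader is not leaked into.
            thread.setContextClassLoader(previous);
        }
    }
}
```

      A wrapped detector could then delegate with something like ContextClassLoaderSwap.callWith(detector.getClass().getClassLoader(), () -> detector.detect(stream, metadata)) on Java 8 or later, assuming the class loader of the detector's class is the bundle loader that also provides dom4j. The same wrapping would apply to the parser.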

            People

              Assignee: Unassigned
              Reporter: Niels Beekman (n.beekman)
              Votes: 0
              Watchers: 2
