Tika / TIKA-1407

Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@5d11346a

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.5
    • Fix Version/s: 1.6
    • Component/s: parser
    • Labels: None
    • Environment: Kubuntu 14.04

    Description

      I'm trying to parse a document created with PowerPoint for Mac.
      This crashes Tika. Interestingly, I can open the file with LibreOffice, and if I re-save it from there in the same format, it shrinks by a few kilobytes and then parses correctly.
      The failing file is at http://amoki.fr/anyFetch_pitch_deck_Allianz_EN_withoutslide9.ppt

      I get the following error using Tika 1.5:

      Exception in thread "main" org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@5d11346a
      at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:244)
      at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
      at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
      at org.apache.tika.cli.TikaCLI$OutputType.process(TikaCLI.java:142)
      at org.apache.tika.cli.TikaCLI.process(TikaCLI.java:418)
      at org.apache.tika.cli.TikaCLI.main(TikaCLI.java:112)
      Caused by: java.lang.RuntimeException: Couldn't instantiate the class for type with id 5000 on class class org.apache.poi.hslf.record.DummyPositionSensitiveRecordWithChildren : java.lang.reflect.InvocationTargetException
      Cause was : java.lang.RuntimeException: Couldn't instantiate the class for type with id 5002 on class class org.apache.poi.hslf.record.DummyPositionSensitiveRecordWithChildren : java.lang.reflect.InvocationTargetException
      Cause was : java.lang.RuntimeException: Couldn't instantiate the class for type with id 5003 on class class org.apache.poi.hslf.record.BinaryTagDataBlob : java.lang.reflect.InvocationTargetException
      Cause was : java.lang.RuntimeException: Couldn't instantiate the class for type with id 4012 on class class org.apache.poi.hslf.record.StyleTextProp9Atom : java.lang.reflect.InvocationTargetException
      Cause was : java.lang.ArrayIndexOutOfBoundsException: 20
      at org.apache.poi.hslf.record.Record.createRecordForType(Record.java:185)
      at org.apache.poi.hslf.record.Record.findChildRecords(Record.java:128)
      at org.apache.poi.hslf.model.SimpleShape.getClientRecords(SimpleShape.java:347)
      at org.apache.poi.hslf.model.SimpleShape.getClientDataRecord(SimpleShape.java:319)
      at org.apache.poi.hslf.model.TextShape.getPlaceholderAtom(TextShape.java:596)
      at org.apache.poi.hslf.model.Sheet.getPlaceholder(Sheet.java:443)
      at org.apache.poi.hslf.model.HeadersFooters.isVisible(HeadersFooters.java:244)
      at org.apache.poi.hslf.model.HeadersFooters.isHeaderVisible(HeadersFooters.java:148)
      at org.apache.tika.parser.microsoft.HSLFExtractor.parse(HSLFExtractor.java:62)
      at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:202)
      at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:167)
      at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
      ... 5 more
      Caused by: java.lang.reflect.InvocationTargetException
      at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
      at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
      at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
      at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
      at org.apache.poi.hslf.record.Record.createRecordForType(Record.java:181)
      ... 16 more
      Caused by: java.lang.RuntimeException: Couldn't instantiate the class for type with id 5002 on class class org.apache.poi.hslf.record.DummyPositionSensitiveRecordWithChildren : java.lang.reflect.InvocationTargetException
      Cause was : java.lang.RuntimeException: Couldn't instantiate the class for type with id 5003 on class class org.apache.poi.hslf.record.BinaryTagDataBlob : java.lang.reflect.InvocationTargetException
      Cause was : java.lang.RuntimeException: Couldn't instantiate the class for type with id 4012 on class class org.apache.poi.hslf.record.StyleTextProp9Atom : java.lang.reflect.InvocationTargetException
      Cause was : java.lang.ArrayIndexOutOfBoundsException: 20
      at org.apache.poi.hslf.record.Record.createRecordForType(Record.java:185)
      at org.apache.poi.hslf.record.Record.findChildRecords(Record.java:128)
      at org.apache.poi.hslf.record.DummyPositionSensitiveRecordWithChildren.<init>(DummyPositionSensitiveRecordWithChildren.java:52)
      ... 21 more
      Caused by: java.lang.reflect.InvocationTargetException
      at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
      at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
      at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
      at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
      at org.apache.poi.hslf.record.Record.createRecordForType(Record.java:181)
      ... 23 more
      Caused by: java.lang.RuntimeException: Couldn't instantiate the class for type with id 5003 on class class org.apache.poi.hslf.record.BinaryTagDataBlob : java.lang.reflect.InvocationTargetException
      Cause was : java.lang.RuntimeException: Couldn't instantiate the class for type with id 4012 on class class org.apache.poi.hslf.record.StyleTextProp9Atom : java.lang.reflect.InvocationTargetException
      Cause was : java.lang.ArrayIndexOutOfBoundsException: 20
      at org.apache.poi.hslf.record.Record.createRecordForType(Record.java:185)
      at org.apache.poi.hslf.record.Record.findChildRecords(Record.java:128)
      at org.apache.poi.hslf.record.DummyPositionSensitiveRecordWithChildren.<init>(DummyPositionSensitiveRecordWithChildren.java:52)
      ... 28 more
      Caused by: java.lang.reflect.InvocationTargetException
      at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
      at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
      at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
      at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
      at org.apache.poi.hslf.record.Record.createRecordForType(Record.java:181)
      ... 30 more
      Caused by: java.lang.RuntimeException: Couldn't instantiate the class for type with id 4012 on class class org.apache.poi.hslf.record.StyleTextProp9Atom : java.lang.reflect.InvocationTargetException
      Cause was : java.lang.ArrayIndexOutOfBoundsException: 20
      at org.apache.poi.hslf.record.Record.createRecordForType(Record.java:185)
      at org.apache.poi.hslf.record.Record.findChildRecords(Record.java:128)
      at org.apache.poi.hslf.record.BinaryTagDataBlob.<init>(BinaryTagDataBlob.java:52)
      ... 35 more
      Caused by: java.lang.reflect.InvocationTargetException
      at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
      at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
      at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
      at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
      at org.apache.poi.hslf.record.Record.createRecordForType(Record.java:181)
      ... 37 more
      Caused by: java.lang.ArrayIndexOutOfBoundsException: 20
      at org.apache.poi.util.LittleEndian.getInt(LittleEndian.java:161)
      at org.apache.poi.hslf.record.StyleTextProp9Atom.<init>(StyleTextProp9Atom.java:70)
      ... 42 more

          People

            Assignee: Unassigned
            Reporter: Matthieu Neamar