Tika / TIKA-2523

Regression in ppt parsing -- "typeface can't be null nor empty"

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

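      While running the large-scale regression tests in preparation for the Tika 1.17 release, we noticed a regression in ppt parsing with POI 3.17. There are about 200 new exceptions, but they all appear to share a single root cause: an IllegalArgumentException ("typeface can't be null nor empty") raised while the FontCollection record is being instantiated.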

      Stacktrace:

      org.apache.poi.hslf.exceptions.HSLFException: Couldn't instantiate the class for type with id 1000 on class class org.apache.poi.hslf.record.Document : java.lang.reflect.InvocationTargetException
      Cause was : org.apache.poi.hslf.exceptions.HSLFException: Couldn't instantiate the class for type with id 1010 on class class org.apache.poi.hslf.record.Environment : java.lang.reflect.InvocationTargetException
      Cause was : org.apache.poi.hslf.exceptions.HSLFException: Couldn't instantiate the class for type with id 2005 on class class org.apache.poi.hslf.record.FontCollection : java.lang.reflect.InvocationTargetException
      Cause was : java.lang.IllegalArgumentException: typeface can't be null nor empty
      	at org.apache.poi.hslf.record.Record.createRecordForType(Record.java:186)
      	at org.apache.poi.hslf.record.Record.buildRecordAtOffset(Record.java:104)
      	at org.apache.poi.hslf.usermodel.HSLFSlideShowImpl.read(HSLFSlideShowImpl.java:279)
      	at org.apache.poi.hslf.usermodel.HSLFSlideShowImpl.buildRecords(HSLFSlideShowImpl.java:260)
      	at org.apache.poi.hslf.usermodel.HSLFSlideShowImpl.<init>(HSLFSlideShowImpl.java:166)
      	at org.apache.poi.hslf.usermodel.HSLFSlideShow.<init>(HSLFSlideShow.java:181)
      	at org.apache.tika.parser.microsoft.HSLFExtractor.parse(HSLFExtractor.java:78)
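
One way to check that a batch of new exceptions really shares one cause is to reduce each stack trace to its innermost "Cause was :" line and count the distinct values. The sketch below is illustrative only: `RootCauseTally`, `innermostCause`, and `tally` are hypothetical names, not part of Tika or POI, and it assumes each trace is available as plain text in the same layout as the trace above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper (not part of Tika or POI): groups stack traces by
// their innermost "Cause was :" line.
final class RootCauseTally {
    private static final String MARKER = "Cause was :";

    /**
     * Returns the text following the last "Cause was :" marker in the trace.
     * Falls back to the first non-empty line when no marker is present.
     */
    static String innermostCause(String trace) {
        String innermost = null;
        String firstLine = null;
        for (String line : trace.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            if (firstLine == null) {
                firstLine = trimmed;
            }
            if (trimmed.startsWith(MARKER)) {
                // Later markers are more deeply nested, so keep overwriting.
                innermost = trimmed.substring(MARKER.length()).trim();
            }
        }
        if (innermost != null) {
            return innermost;
        }
        return firstLine != null ? firstLine : "";
    }

    /** Counts how many traces share each innermost cause. */
    static Map<String, Integer> tally(List<String> traces) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String trace : traces) {
            counts.merge(innermostCause(trace), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Abbreviated sample modelled on the trace in this issue.
        String sample = String.join("\n",
            "org.apache.poi.hslf.exceptions.HSLFException: Couldn't instantiate the class for type with id 1000",
            "Cause was : org.apache.poi.hslf.exceptions.HSLFException: Couldn't instantiate the class for type with id 2005",
            "Cause was : java.lang.IllegalArgumentException: typeface can't be null nor empty",
            "\tat org.apache.poi.hslf.record.Record.createRecordForType(Record.java:186)");
        System.out.println(tally(Arrays.asList(sample, sample)));
    }
}
```

If the resulting map has a single key, the new exceptions collapse to one root cause; several keys would suggest more than one regression.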
      

        Attachments

        1. 802350.ppt
          73 kB
          Tim Allison

              People

              • Assignee: Unassigned
              • Reporter: Tim Allison (tallison@apache.org)