Tika / TIKA-348

Tika can't parse XLSX when built with the latest POI trunk version

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.5
    • Fix Version/s: 0.6
    • Component/s: parser
    • Labels: None

    Description

      OOXMLParserTest.testExcel fails with the following exception:

      org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.ooxml.OOXMLParser@82d37
      at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:122)
      at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:101)
      at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:114)
      at org.apache.tika.parser.microsoft.ooxml.OOXMLParserTest.testExcel(OOXMLParserTest.java:43)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:40)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      at com.intellij.rt.execution.application.AppMain.main(AppMain.java:90)
      Caused by: java.lang.IllegalStateException: Cannot get a text value from a numeric formula cell
      at org.apache.poi.xssf.usermodel.XSSFCell.typeMismatch(XSSFCell.java:781)
      at org.apache.poi.xssf.usermodel.XSSFCell.checkFormulaCachedValueType(XSSFCell.java:286)
      at org.apache.poi.xssf.usermodel.XSSFCell.getRichStringCellValue(XSSFCell.java:274)
      at org.apache.poi.xssf.usermodel.XSSFCell.getRichStringCellValue(XSSFCell.java:63)
      at org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.buildXHTML(XSSFExcelExtractorDecorator.java:72)
      at org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor.getXHTML(AbstractOOXMLExtractor.java:69)
      at org.apache.tika.parser.microsoft.ooxml.OOXMLParser.parse(OOXMLParser.java:49)
      at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:120)
      ... 26 more
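
The trace shows a text accessor (getRichStringCellValue) being called on a formula cell whose cached result is numeric, which the newer POI code rejects. A minimal sketch of the usual guard follows: pick the accessor from the cached result type instead of always asking for text. The Cell and CellKind types below are hypothetical stand-ins written for this example, not POI classes, and this is not the content of the attached patch.

```java
import java.util.Locale;

// Hypothetical stand-in types for illustration only; these are NOT the POI classes.
final class FormulaCellTextDemo {

    enum CellKind { STRING, NUMERIC, BOOLEAN, FORMULA }

    /** Minimal model of a spreadsheet cell that may hold a cached formula result. */
    static final class Cell {
        final CellKind kind;        // declared cell type
        final CellKind cachedKind;  // type of the cached result; only used when kind == FORMULA
        final Object value;         // String, Double, or Boolean

        Cell(CellKind kind, CellKind cachedKind, Object value) {
            this.kind = kind;
            this.cachedKind = cachedKind;
            this.value = value;
        }

        /** Strict text accessor: fails on non-text cells, like the call in the trace. */
        String getText() {
            CellKind effective = (kind == CellKind.FORMULA) ? cachedKind : kind;
            if (effective != CellKind.STRING) {
                throw new IllegalStateException("Cannot get a text value from a "
                        + effective.name().toLowerCase(Locale.ROOT)
                        + (kind == CellKind.FORMULA ? " formula" : "") + " cell");
            }
            return (String) value;
        }
    }

    /** Chooses the accessor from the effective type instead of always asking for text. */
    static String textOf(Cell cell) {
        CellKind effective = (cell.kind == CellKind.FORMULA) ? cell.cachedKind : cell.kind;
        switch (effective) {
            case STRING:
                return cell.getText();
            case NUMERIC:
                return String.valueOf((Double) cell.value);
            case BOOLEAN:
                return String.valueOf((Boolean) cell.value);
            default:
                return ""; // unknown or inconsistent cached type: emit nothing
        }
    }

    public static void main(String[] args) {
        Cell numericFormula = new Cell(CellKind.FORMULA, CellKind.NUMERIC, 42.0);
        try {
            numericFormula.getText(); // unguarded access reproduces the failure mode
        } catch (IllegalStateException e) {
            System.out.println("Unguarded: " + e.getMessage());
        }
        System.out.println("Guarded: " + textOf(numericFormula));
    }
}
```

In real POI code the equivalent check would inspect the formula cell's cached result type before choosing between the string and numeric accessors; the exact method names and return types differ between POI versions, so treat the above as a shape rather than a drop-in fix.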

      Attachments

        1. TIKA-348.patch
          2 kB
          Maxim Valyanskiy

          People

            Assignee: Chris A. Mattmann (chrismattmann)
            Reporter: Maxim Valyanskiy (maxim.valyanskiy)
