Tika / TIKA-839

TikaException with testPPT.potm in Tika GUI / CLI

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.1
    • Fix Version/s: 1.1
    • Component/s: parser
    • Labels: None
    • Environment: Windows 7

    Description

      Opening the testPPT.potm file from the parsers' test-documents folder in a recent build of Tika (GUI or CLI) results in a TikaException, itself 'Caused by: org.apache.xmlbeans.XmlException: error: The document is not a presentation@http://schemas.openxmlformats.org/presentationml/2006/main: document element namespace mismatch expected "http://schemas.openxmlformats.org/presentationml/2006/main" got "http://schemas.openxmlformats.org/presentationml/2006/3/main"'. When I opened the file in MS Office 2007, it reported that the file had been created with a beta version of Office and would be converted to a more up-to-date format the next time it was saved. I changed its contents to match those of the other Office 2007 presentation documents in the test-documents folder and added the file and its MIME type to the OOXMLParserTest class; after that, the .potm file parsed without problems. I'll attach a patch shortly.
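
      For anyone who wants to confirm the mismatch without going through Tika, the following is a minimal sketch using only the JDK. It opens the .potm as a zip package and prints the namespace of the root element of the main presentation part. The class name PresentationNamespaceCheck and its helper methods are made up for illustration and are not part of Tika or the attached patch; the part path ppt/presentation.xml is assumed to be the conventional location and is not resolved through the package relationships.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper for illustration only; not part of Tika or the patch.
class PresentationNamespaceCheck {

    // Namespace the parser expects, quoted from the exception message.
    static final String EXPECTED_NS =
            "http://schemas.openxmlformats.org/presentationml/2006/main";

    // Assumption: the main presentation part sits at the conventional path.
    static final String MAIN_PART = "ppt/presentation.xml";

    // Returns the namespace URI of the root element of the given XML stream
    // (null if the root element has no namespace).
    static String rootNamespace(InputStream xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // required for getNamespaceURI()
        Document doc = factory.newDocumentBuilder().parse(xml);
        return doc.getDocumentElement().getNamespaceURI();
    }

    // Opens the OOXML package as a zip and inspects its main part.
    static String rootNamespaceOfPackage(String path) throws Exception {
        try (ZipFile zip = new ZipFile(path)) {
            ZipEntry entry = zip.getEntry(MAIN_PART);
            if (entry == null) {
                throw new IOException("No " + MAIN_PART + " in " + path);
            }
            try (InputStream in = zip.getInputStream(entry)) {
                return rootNamespace(in);
            }
        }
    }

    // True when the namespace matches the one named in the exception.
    static boolean isExpectedNamespace(String ns) {
        return EXPECTED_NS.equals(ns);
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: PresentationNamespaceCheck <file.potm>");
            return;
        }
        String ns = rootNamespaceOfPackage(args[0]);
        System.out.println("root namespace: " + ns);
        System.out.println("matches expected: " + isExpectedNamespace(ns));
    }
}
```

      Running this against the original testPPT.potm and against the replacement attached here should show whether the root namespace differs between the two, which is the condition the XmlException complains about.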

      Attachments

        1. TIKA-839.patch
          1 kB
          John Mastarone
        2. testPPT.potm
          39 kB
          John Mastarone

          People

            Assignee: Unassigned
            Reporter: John Mastarone (jfm.apache)
            Votes: 0
            Watchers: 0
