Tika / TIKA-2922

Regression issue with detecting .dotx and .xlam MS Office mime-types


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.22
    • Fix Version/s: 1.23
    • Component/s: parser
    • None
    • N/A

    Description

      After upgrading to 1.22, .dotx and .xlam files are no longer detected properly. 

      They are now detected as:

      .dotx -> application/vnd.ms-word.template.macroenabled.12
      .xlam -> application/x-tika-ooxml

      They should be detected as they were before the upgrade:

      .dotx -> application/vnd.openxmlformats-officedocument.wordprocessingml.template
      .xlam -> application/vnd.ms-excel.addin.macroenabled.12

      Reference: https://docs.microsoft.com/en-us/previous-versions/office/office-2007-resource-kit/ee309278(v=office.12)

      The incorrect mapping comes from StreamingZipContainerDetector and ZipContainerDetectorBase.

      I will submit a pull request shortly with the correct mapping.
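
      For anyone who wants to check their own files, here is a minimal sketch of a regression check built only from the expected mappings listed in this description. The class and method names (OoxmlDetectionCheck, isMisdetected) are hypothetical and not part of Tika. The detected type is passed in as a plain string, on the assumption that the caller obtains it from Tika separately.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of Tika: compares a detected media type
// against the expected types listed in this issue.
public class OoxmlDetectionCheck {

    // Expected media types, taken from the issue description.
    static final Map<String, String> EXPECTED = Map.of(
        "dotx", "application/vnd.openxmlformats-officedocument.wordprocessingml.template",
        "xlam", "application/vnd.ms-excel.addin.macroenabled.12");

    // Returns the lower-cased extension without the dot, or "" if there is none.
    static String extensionOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot < 0 ? "" : fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    // True when the extension has a known expected type and the detected type differs.
    static boolean isMisdetected(String fileName, String detectedType) {
        String expected = EXPECTED.get(extensionOf(fileName));
        return expected != null && !expected.equalsIgnoreCase(detectedType);
    }

    public static void main(String[] args) {
        // The two results reported against 1.22 are flagged as wrong.
        System.out.println(isMisdetected("template.dotx",
            "application/vnd.ms-word.template.macroenabled.12")); // true
        System.out.println(isMisdetected("addin.xlam",
            "application/x-tika-ooxml"));                          // true
        // The expected result is accepted.
        System.out.println(isMisdetected("template.dotx",
            "application/vnd.openxmlformats-officedocument.wordprocessingml.template")); // false
    }
}
```

      Extensions that are not in the table are never flagged, so the check stays limited to the two types this issue covers.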

       

            People

              Assignee: Unassigned
              Reporter: Pascal Essiembre (pascal.essiembre)