TIKA-3656

Tika returns the wrong content type for DOCX files

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.2.0
    • Fix Version/s: None
    • Component/s: None
    • Labels: None
    • Environment: Windows 10, Java 1.8

    Description

      Steps to reproduce

      1. Select a DOCX file, say example.docx
      2. Rename the file so that it has a .pdf extension, say example.pdf
      3. Use Tika to detect the content type of example.pdf
      4. Tika returns application/zip instead of
         application/vnd.openxmlformats-officedocument.wordprocessingml.document
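
The reported result is consistent with detection stopping at the file's magic bytes: a DOCX file is a ZIP package, so a check that only reads the leading bytes sees a plain ZIP archive, and the misleading .pdf name gives no further hint. The stdlib-only Java sketch below illustrates that difference. It is not Tika's code, and the class name, method names, and the `word/document.xml` heuristic are assumptions made for illustration. Whether this is the cause here is also an assumption, since the report does not state which Tika modules were on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical illustration only; none of these names come from Tika.
class DocxSniffSketch {

    static final String ZIP = "application/zip";
    static final String DOCX =
        "application/vnd.openxmlformats-officedocument.wordprocessingml.document";

    // Builds a tiny ZIP that mimics the layout of a DOCX package.
    // It is not a valid Word document; only the entry names matter here.
    static byte[] buildDocxLikeZip() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            for (String name : new String[] {"[Content_Types].xml", "word/document.xml"}) {
                zip.putNextEntry(new ZipEntry(name));
                zip.write("<x/>".getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
            }
        }
        return bytes.toByteArray();
    }

    // Magic-byte check: a ZIP local file header starts with 'P' 'K' 0x03 0x04.
    static String detectByMagic(byte[] data) {
        boolean isZip = data.length >= 4
            && data[0] == 'P' && data[1] == 'K' && data[2] == 3 && data[3] == 4;
        return isZip ? ZIP : "application/octet-stream";
    }

    // Container-aware check: open the ZIP and look for the main Word part.
    // Simplification: a real detector would read [Content_Types].xml instead.
    static String detectByContainer(byte[] data) throws IOException {
        if (!ZIP.equals(detectByMagic(data))) {
            return detectByMagic(data);
        }
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(data))) {
            for (ZipEntry e = zip.getNextEntry(); e != null; e = zip.getNextEntry()) {
                if ("word/document.xml".equals(e.getName())) {
                    return DOCX;
                }
            }
        }
        return ZIP;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = buildDocxLikeZip();
        System.out.println(detectByMagic(data));
        System.out.println(detectByContainer(data));
    }
}
```

In this sketch the same bytes are classified as application/zip by the magic-byte check and as the wordprocessingml type once the container is opened, which matches the gap between the actual and expected results in step 4.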

          People

            Assignee: Unassigned
            Reporter: Ajesh
            Votes: 0
            Watchers: 2

            Dates

              Created:
              Updated: