Tika / TIKA-1256

Tika 1.4 API detects the wrong MIME type for an MS Office 2007 Excel ".xlsx" file


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: 1.4
    • Fix Version/s: None
    • Component/s: mime

    Description

      I am using the Tika 1.4 jars in a standalone project.
      When I run the project from Eclipse, Tika 1.4 detects the correct MIME type.
      When I build a jar file from my project and run it from the command prompt, it detects the wrong MIME type.
      My code is below:
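      // Fragment: needs java.io.FileInputStream, java.io.InputStream, org.xml.sax.ContentHandler,
      // org.apache.tika.metadata.Metadata, org.apache.tika.parser.*, org.apache.tika.sax.BodyContentHandler.
      // `file`, `logger` and `mimeType` are declared elsewhere in the project.
      Parser parser = new AutoDetectParser();
      InputStream stream = new FileInputStream(file);
      try {
          int writeUnlimited = -1; // -1 disables the handler's write limit
          ContentHandler contentHandler = new BodyContentHandler(writeUnlimited);
          Metadata metadata = new Metadata();
          parser.parse(stream, contentHandler, metadata, new ParseContext());
          mimeType = metadata.get(Metadata.CONTENT_TYPE);
          logger.info("Correct MimeType value for '" + file.getName() + "' file is: " + mimeType);
      } finally {
          stream.close(); // Java 1.6 has no try-with-resources
      }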
      Output from Eclipse:
      Correct MimeType value for 'CIQ_83517.xlsx' file is: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet

      Output from the command prompt:
      Correct MimeType value for 'CIQ_83517.xlsx' file is: application/x-tika-ooxml

      I have only the Tika 1.4 jar and its dependent jar files.
      Is this an issue with my code, or does the Tika 1.4 jar have a problem?
      I am using Java 1.6.

      Thanks for your help
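
One possible explanation, which this report does not confirm, is that the packaged jar loses the `META-INF/services` registration files that Tika reads to discover its container-aware detectors. If those entries are missing at runtime, an `.xlsx` file may only be identified as the generic `application/x-tika-ooxml`. The following is a minimal diagnostic sketch that uses only the JDK and lists whatever detector registrations are visible on the current classpath. The class name `ServiceFileCheck` and its helper methods are hypothetical, and the resource path is an assumption based on the standard Java service-provider convention.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.URL;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;

// Hypothetical helper, not part of Tika: lists service registrations on the classpath.
public class ServiceFileCheck {
    // Assumed location of the detector registrations (standard service-file convention).
    static final String DETECTOR_SERVICE = "META-INF/services/org.apache.tika.detect.Detector";

    // Parses one service file: one class name per line, '#' starts a comment.
    static List<String> parseProviders(Reader in) throws IOException {
        List<String> names = new ArrayList<String>();
        BufferedReader reader = new BufferedReader(in);
        String line;
        while ((line = reader.readLine()) != null) {
            int hash = line.indexOf('#');
            if (hash >= 0) {
                line = line.substring(0, hash);
            }
            line = line.trim();
            if (line.length() > 0) {
                names.add(line);
            }
        }
        return names;
    }

    // Collects provider names from every copy of the resource the loader can see.
    static List<String> listProviders(ClassLoader loader, String resource) throws IOException {
        List<String> names = new ArrayList<String>();
        Enumeration<URL> urls = loader.getResources(resource);
        while (urls.hasMoreElements()) {
            Reader in = new InputStreamReader(urls.nextElement().openStream(), "UTF-8");
            try {
                names.addAll(parseProviders(in));
            } finally {
                in.close(); // written for Java 1.6, so no try-with-resources
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        ClassLoader loader = ClassLoader.getSystemClassLoader();
        for (String name : listProviders(loader, DETECTOR_SERVICE)) {
            System.out.println(name);
        }
    }
}
```

Running this once from Eclipse and once with the same classpath as the command-prompt run would show whether the two environments see the same registrations. If the packaged run prints fewer entries, the packaging step is a more likely culprit than the parsing code, though that remains an assumption to verify.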

          People

            Assignee: Unassigned
            Reporter: Kavitha (kavitha.1989)