Tika / TIKA-1554

Improve EMF file detection

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.7
    • Fix Version/s: 1.8
    • Component/s: detector
    • Labels: None

    Description

      Many of my files are being incorrectly detected as application/x-emf. I think the current magic is too generic, so it also matches unrelated files. According to the MS documentation (https://msdn.microsoft.com/en-us/library/cc230635.aspx and https://msdn.microsoft.com/en-us/library/dd240211.aspx), it can be tightened to:

      <mime-type type="application/x-emf">
        <acronym>EMF</acronym>
        <_comment>Extended Metafile</_comment>
        <glob pattern="*.emf"/>
        <magic priority="50">
          <match value="0x01000000" type="string" offset="0">
            <match value=" EMF" type="string" offset="40"/>
          </match>
        </magic>
      </mime-type>
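
      To make the intent of the nested match explicit, here is a minimal Java sketch of the same check done by hand: the first four bytes must be 01 00 00 00 and the four bytes at offset 40 must spell " EMF". The class name EmfMagicCheck and the method looksLikeEmf are hypothetical names used only for illustration. They are not part of Tika and do not show how Tika evaluates its magic rules internally.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not a Tika class: mirrors the two nested <match> rules by hand.
final class EmfMagicCheck {
    // Outer match: bytes 01 00 00 00 at offset 0.
    private static final byte[] RECORD_TYPE = {0x01, 0x00, 0x00, 0x00};
    // Inner match: the ASCII string " EMF" (20 45 4D 46) at offset 40.
    private static final byte[] SIGNATURE = " EMF".getBytes(StandardCharsets.US_ASCII);
    private static final int SIGNATURE_OFFSET = 40;

    private EmfMagicCheck() {}

    // Returns true only when BOTH byte patterns are present, as in the nested match.
    static boolean looksLikeEmf(byte[] prefix) {
        if (prefix == null || prefix.length < SIGNATURE_OFFSET + SIGNATURE.length) {
            return false; // too short to contain the signature at offset 40
        }
        return matchesAt(prefix, 0, RECORD_TYPE)
                && matchesAt(prefix, SIGNATURE_OFFSET, SIGNATURE);
    }

    // Compares expected.length bytes of data, starting at offset, against expected.
    private static boolean matchesAt(byte[] data, int offset, byte[] expected) {
        byte[] slice = Arrays.copyOfRange(data, offset, offset + expected.length);
        return Arrays.equals(slice, expected);
    }
}
```

      A file that starts with 01 00 00 00 but lacks " EMF" at offset 40 is rejected by this check, which is the kind of false positive the stricter magic is meant to avoid.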
      

      Attachments

        1. nonEmf.dat
          0.0 kB
          Luís Filipe Nassif

          People

            Assignee: Chris A. Mattmann (chrismattmann)
            Reporter: Luís Filipe Nassif (lfcnassif)
            Votes: 0
            Watchers: 3