  1. Tika
  2. TIKA-3023

Text files starting with MOVI are detected as X-SGI-Movie

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.23
    • Fix Version/s: 1.24
    • Component/s: None
    • Labels: None
    • Environment: Issue recreated on:

      Windows 10 Professional 64-bit running the runnable JAR

      Ubuntu 16.04.6 LTS running Tika-Python

    Description

      If a plain-text file starts with "MOVI", Tika detects it as an SGI Movie (video/x-sgi-movie).

      The ASCII bytes for "MOVI" are 4D 4F 56 49 in hex, which is identical to the header (magic number) of the SGI Movie file format:

      https://reposcope.com/mimetype/video/x-sgi-movie
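
      The collision can be checked directly. The following is a minimal Python sketch, not Tika's detector: `SGI_MOVIE_MAGIC` and `naive_detect` are hypothetical names, and the sketch assumes a detector that looks only at the first four bytes.

```python
# Hypothetical illustration of a 4-byte magic-number match; not Tika's code.
SGI_MOVIE_MAGIC = bytes.fromhex("4D4F5649")  # the header bytes quoted in this report


def naive_detect(data: bytes) -> str:
    """Return a MIME type based only on the leading bytes of the data."""
    if data.startswith(SGI_MOVIE_MAGIC):
        return "video/x-sgi-movie"
    return "text/plain"


# An ordinary sentence that happens to begin with "MOVI".
text = "MOVIE night is on Friday.\n".encode("ascii")

# Show the first four bytes as hex, then the (mis)detected type.
print(" ".join(f"{b:02X}" for b in text[:4]))
print(naive_detect(text))
```

      Any detector that relies on these four bytes alone cannot tell such a text file apart from an SGI Movie.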

      This SGI format isn't supported, so the content of any text file that starts this way is lost. I've attached a simple file that should recreate the problem.
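
      A comparable test file can be generated with a few lines of Python. This is only a sketch: the exact contents of the attached capitalmovie.txt are not shown in this report, so the sentence written below is an assumption, and `write_repro_file` is a hypothetical helper name.

```python
import os
import tempfile


def write_repro_file(directory: str, name: str = "capitalmovie.txt") -> str:
    """Write a small plain-text file whose first four bytes are 'MOVI'."""
    path = os.path.join(directory, name)
    with open(path, "w", encoding="ascii", newline="\n") as handle:
        # Assumed contents; any text beginning with "MOVI" should do.
        handle.write("MOVIE titles are listed below.\n")
    return path


with tempfile.TemporaryDirectory() as tmp:
    repro = write_repro_file(tmp)
    with open(repro, "rb") as handle:
        header = handle.read(4)
    # The leading bytes are what a magic-number check would see.
    print(header)
```

      Passing a file like this to the runnable JAR or to Tika-Python on an affected version should show the misdetection.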

      Attachments

        1. capitalmovie.txt
          0.0 kB
          Steve

        Activity

          People

            Assignee: Unassigned
            Reporter: Steve (ssundberg)
            Votes: 0
            Watchers: 4
