Tika / TIKA-1141

JavaScript files that contain "<html" are detected as text/html

Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 1.2
    • Fix Version/s: None
    • Component/s: mime
    • Labels: None

    Description

      The MimeTypes detector returns text/html as the MIME type for any JavaScript file that contains the string "<html" near the start of its content. I believe this is due to the rule <match value="<html" type="string" offset="0:8192"/> in the tika-mimetypes.xml file, which matches that string at any offset from 0 to 8192.
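      To illustrate why such a rule fires on script files, here is a minimal, self-contained sketch of an offset-range string match. It is a simplified model of the rule quoted above, not Tika's actual implementation: the class HtmlMagicSketch and the helpers matchesAtOffsetRange and looksLikeHtml are hypothetical names introduced for this example, and the sketch assumes the range 0:8192 means the pattern may begin at any offset in that range.

```java
import java.nio.charset.StandardCharsets;

public class HtmlMagicSketch {

    // Hypothetical helper (not a Tika API): returns true if `pattern`
    // begins at any offset in [from, to] within `data`.
    static boolean matchesAtOffsetRange(byte[] data, byte[] pattern, int from, int to) {
        // The last offset at which the whole pattern still fits in the data.
        int lastStart = Math.min(to, data.length - pattern.length);
        for (int start = from; start <= lastStart; start++) {
            int i = 0;
            while (i < pattern.length && data[start + i] == pattern[i]) {
                i++;
            }
            if (i == pattern.length) {
                return true;
            }
        }
        return false;
    }

    // Simplified stand-in for the quoted rule: "<html" anywhere in offsets 0..8192.
    static boolean looksLikeHtml(byte[] data) {
        byte[] pattern = "<html".getBytes(StandardCharsets.US_ASCII);
        return matchesAtOffsetRange(data, pattern, 0, 8192);
    }

    public static void main(String[] args) {
        // A script that merely embeds an HTML string literal.
        byte[] script = "var tpl = \"<html><body></body></html>\";\n"
                .getBytes(StandardCharsets.UTF_8);
        // A script with no HTML markup at all.
        byte[] plain = "function add(a, b) { return a + b; }\n"
                .getBytes(StandardCharsets.UTF_8);

        System.out.println(looksLikeHtml(script)); // prints "true"
        System.out.println(looksLikeHtml(plain));  // prints "false"
    }
}
```

      Because the match looks only at raw bytes and carries no notion of surrounding syntax, a string literal inside a script is indistinguishable from real markup under this model.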

      Attachments

        1. jquery-2.0.3.js (123 kB, Daniel Goltz)

            People

              Assignee: Unassigned
              Reporter: David Hara (hodavidhara)
              Votes: 1
              Watchers: 8

              Dates

                Created:
                Updated: