Tika / TIKA-1141

JavaScript files that contain "<html" are detected as text/html

    Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 1.2
    • Fix Version/s: None
    • Component/s: mime
    • Labels: None

      Description

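      The MimeTypes detector returns text/html as the MIME type for any JavaScript file that contains the string "<html". I believe this is caused by the rule <match value="<html" type="string" offset="0:8192"/> in tika-mimetypes.xml: as I understand it, that rule matches the string wherever it starts within offsets 0 to 8192, so a string literal inside a script is enough to trigger it.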
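
      To illustrate the suspected cause, here is a minimal sketch in plain Java. It assumes the rule behaves like a substring search over a window of start offsets (0 through 8192). The class and method names (HtmlMagicSketch, matchesInWindow, looksLikeHtml) are hypothetical and are not part of Tika; this is not Tika's actual detector code.

```java
import java.nio.charset.StandardCharsets;

public class HtmlMagicSketch {

    // Hypothetical helper (not a Tika API): returns true if `needle`
    // starts at any offset in [minOffset, maxOffset] within `data`.
    static boolean matchesInWindow(byte[] data, byte[] needle, int minOffset, int maxOffset) {
        // The last start position that still leaves room for the whole needle.
        int lastStart = Math.min(maxOffset, data.length - needle.length);
        for (int start = minOffset; start <= lastStart; start++) {
            boolean found = true;
            for (int i = 0; i < needle.length; i++) {
                if (data[start + i] != needle[i]) {
                    found = false;
                    break;
                }
            }
            if (found) {
                return true;
            }
        }
        return false;
    }

    // Assumed reading of <match value="<html" type="string" offset="0:8192"/>:
    // a case-sensitive search for "<html" starting anywhere in offsets 0..8192.
    static boolean looksLikeHtml(String content) {
        byte[] data = content.getBytes(StandardCharsets.UTF_8);
        byte[] needle = "<html".getBytes(StandardCharsets.UTF_8);
        return matchesInWindow(data, needle, 0, 8192);
    }

    public static void main(String[] args) {
        // A script that merely mentions "<html" inside a string literal.
        String script = "var page = \"<html><body></body></html>\";\n";
        // A script with no such string.
        String plain = "var x = 1;\n";
        System.out.println(looksLikeHtml(script)); // prints "true"
        System.out.println(looksLikeHtml(plain));  // prints "false"
    }
}
```

      Under that assumption, the rule says nothing about where in the file the markup appears or whether the surrounding content is script source, which would explain why a JavaScript file embedding an HTML snippet is reported as text/html.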

        Attachments

        1. jquery-2.0.3.js
          123 kB
          Daniel Goltz

            People

            • Assignee: Unassigned
            • Reporter: David Hara (hodavidhara)
            • Votes: 4
            • Watchers: 10

              Dates

              • Created:
                Updated: