  1. Tika
  2. TIKA-1110

Incorrectly declared SUPPORTED_TYPES in ChmParser.


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.3, 1.4
    • Fix Version/s: 1.5
    • Component/s: parser
    • Labels: None

    Description

      This link gives "application/vnd.ms-htmlhelp" as the official MIME type for these files. Two other types are also used in the wild:

      • application/chm
      • application/x-chm

      tika-mimetypes.xml uses the correct official MIME type, but ChmParser declares that it supports only "application/chm". As a result, content that carries the official MIME type (e.g. assigned by a Detector, parsed using AutoDetectParser, or simply declared in metadata) fails to parse because the parser does not recognize that type.

      The fix seems simple: ChmParser should also declare all of the above types in its SUPPORTED_TYPES.
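
      The proposed fix can be illustrated with a minimal, self-contained sketch. It is not the actual ChmParser source or the attached patch: the class name ChmTypesSketch and the helper supports() are hypothetical, and MIME types are modelled as plain strings so the example runs without Tika on the classpath. In Tika itself the set would presumably hold MediaType values rather than strings.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for the parser's type declaration; not the real ChmParser.
class ChmTypesSketch {
    // All three MIME types named in the description above.
    static final Set<String> SUPPORTED_TYPES = Collections.unmodifiableSet(
            new HashSet<>(Arrays.asList(
                    "application/vnd.ms-htmlhelp", // official type, used by tika-mimetypes.xml
                    "application/chm",             // the only type declared before the fix
                    "application/x-chm")));        // other variant seen in the wild

    // Hypothetical helper: true when the given type is one the parser declares.
    static boolean supports(String mimeType) {
        return mimeType != null
                && SUPPORTED_TYPES.contains(mimeType.trim().toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(supports("application/vnd.ms-htmlhelp"));
        System.out.println(supports("application/x-chm"));
        System.out.println(supports("text/plain"));
    }
}
```

      With only "application/chm" in the set, the first two lookups in main would miss, which mirrors the failure described in this issue.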

      Attachments

        1. TIKA-1110.patch
          2 kB
          Vadim Roizman


          People

            Assignee: Jukka Zitting (jukkaz)
            Reporter: Andrzej Bialecki (ab)
            Votes: 0
            Watchers: 5

            Dates

              Created:
              Updated:
              Resolved: