  1. Tika
  2. TIKA-1122

Tika fails to parse chm files

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Duplicate
    • Affects Version/s: 1.3
    • Fix Version/s: None
    • Component/s: parser
    • Labels: None

    Description

      (Reported by Jan Riewe on the Nutch user group; see http://lucene.472066.n3.nabble.com/CHM-Files-and-Tika-td3999735.html)
      Nutch fails to parse CHM files, logging:
      ERROR tika.TikaParser - Can't retrieve Tika parser for mime-type application/vnd.ms-htmlhelp

      Running tika-app standalone (i.e., not via Nutch) fails as well: not a single CHM file was parsed. I tried 10-15 different CHM files of varying sizes.
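
      When reproducing this, it may help to first rule out mislabeled inputs. The sketch below uses only the Java standard library and checks whether a file starts with the ASCII bytes "ITSF", the signature that CHM files begin with. The class name ChmSignatureCheck and the method looksLikeChm are hypothetical names chosen for illustration; they are not part of Tika or Nutch, and passing this check does not mean Tika can parse the file.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical triage helper, not part of Tika: checks the CHM file signature.
class ChmSignatureCheck {
    // CHM files begin with the four ASCII bytes "ITSF".
    private static final byte[] ITSF = "ITSF".getBytes(StandardCharsets.US_ASCII);

    // Returns true if the first four bytes of the file are "ITSF".
    static boolean looksLikeChm(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            byte[] head = new byte[ITSF.length];
            int read = 0;
            // A single read() may return fewer bytes than requested, so loop.
            while (read < head.length) {
                int n = in.read(head, read, head.length - read);
                if (n < 0) {
                    return false; // file is shorter than the signature
                }
                read += n;
            }
            return Arrays.equals(head, ITSF);
        }
    }

    // Prints one line per path given on the command line.
    public static void main(String[] args) throws IOException {
        for (String arg : args) {
            boolean chm = looksLikeChm(Paths.get(arg));
            System.out.println(arg + ": " + (chm ? "CHM signature found" : "no CHM signature"));
        }
    }
}
```

      If every test file carries the signature and parsing still fails, the problem more likely lies in parser lookup for application/vnd.ms-htmlhelp than in the inputs.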

            People

              Assignee: Unassigned
              Reporter: Tejas Patil (tejasp)
              Votes: 0
              Watchers: 4