Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.5.1
    • Fix Version/s: 1.9
    • Component/s: parser
    • Labels:
      None

      Description

      (reported by Jan Riewe, see http://lucene.472066.n3.nabble.com/CHM-Files-and-Tika-td3999735.html)

      Nutch fails to parse CHM files with the error:

      ERROR tika.TikaParser - Can't retrieve Tika parser for mime-type application/vnd.ms-htmlhelp

      Tested with chm test files from Tika:

       % bin/nutch parsechecker file:/.../tika/trunk/tika-parsers/src/test/resources/test-documents/testChm.chm
      

      Tika itself parses this document (but does not extract any content).

              People

              • Assignee:
                Unassigned
                Reporter:
                wastl-nagel Sebastian Nagel
              • Votes:
                0
                Watchers:
                2
