Tika / TIKA-1041

Tika 1.2 universalcharset errors

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.2
    • Fix Version/s: 1.3
    • Component/s: None
    • Labels: None
    • Environment: Solr 4.0 with Tika 1.2 on Tomcat 7.0.8, with ManifoldCF v1.1dev

    Description

      This is somewhat confusing and frustrating. I successfully crawled Opentext using the environment listed above. Then I recrawled, and the job aborted almost immediately.
      It choked on images, so I excluded them for now.
      But now it is choking on .txt files too!
      Sometimes I get this error:
      SEVERE: null:java.lang.RuntimeException: java.lang.NoClassDefFoundError: org/mozilla/universalchardet/CharsetListener

      And sometimes I get this one:
      SEVERE: null:java.lang.RuntimeException: java.lang.NoClassDefFoundError: org/apache/tika/parser/txt/UniversalEncodingListener
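
A `NoClassDefFoundError` for `org/mozilla/universalchardet/CharsetListener` usually means the jar providing that package (juniversalchardet) is not visible to the class loader that loaded Tika. A minimal sketch for checking this follows; the class name `ClasspathCheck` and the helper `isLoadable` are hypothetical names chosen for illustration, and the check only reflects the classpath of the JVM it runs in, not necessarily the Solr web application's class loader.

```java
public class ClasspathCheck {

    // Returns true if the named class can be loaded by this class's loader.
    // The class is loaded without running its static initializers.
    public static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            // The class itself is not on the classpath.
            return false;
        } catch (LinkageError e) {
            // The class was found, but something it depends on (for example a
            // superinterface) is missing; NoClassDefFoundError lands here.
            return false;
        }
    }

    public static void main(String[] args) {
        // The two class names taken from the error messages in this report.
        String[] names = {
            "org.mozilla.universalchardet.CharsetListener",
            "org.apache.tika.parser.txt.UniversalEncodingListener"
        };
        for (String name : names) {
            System.out.println(name + " -> " + (isLoadable(name) ? "found" : "MISSING"));
        }
    }
}
```

Running this with the same jars that are deployed in the web application's `WEB-INF/lib` directory on the classpath should show whether either class is missing. If only the second class fails, a likely explanation is that it was found but could not be linked because the first one is absent.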

          People

            Assignee: Jukka Zitting (jukkaz)
            Reporter: David Morana (dmorana)