Description
As reported by Andrzej [1], the HTMLParser crashes when the charset found in a meta tag is not a legal charset name, e.g.:
<meta http-equiv="Content-Type" content="text/html; charset=ISO 8859-1"/>
[1] http://mail-archives.apache.org/mod_mbox/tika-user/201006.mbox/%3C4C2A102D.7090703@getopt.org%3E
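The failure can be reproduced with the JDK alone: Charset.isSupported() throws IllegalCharsetNameException, rather than returning false, when the name contains a character such as a space. The sketch below shows one possible guard. The class SafeCharsetCheck and the method isCharsetSupported are hypothetical names used for illustration; this is not necessarily how the fix was implemented in Tika.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;

// Hypothetical helper for illustration only; not taken from the Tika code base.
public class SafeCharsetCheck {

    /**
     * Returns true only if the name is a legal charset name that this JVM supports.
     * Illegal names (e.g. containing a space) and null yield false instead of an exception.
     */
    public static boolean isCharsetSupported(String name) {
        if (name == null) {
            // Charset.isSupported(null) would throw IllegalArgumentException
            return false;
        }
        try {
            return Charset.isSupported(name);
        } catch (IllegalCharsetNameException e) {
            // Thrown for syntactically illegal names such as "ISO 8859-1"
            return false;
        }
    }

    public static void main(String[] args) {
        // The value from the meta tag above: illegal because of the space
        System.out.println(isCharsetSupported("ISO 8859-1")); // prints "false"
        // The correctly spelled name, which every Java platform must support
        System.out.println(isCharsetSupported("ISO-8859-1")); // prints "true"
    }
}
```

With a guard like this, a parser could treat an illegal charset declaration as absent and fall back to its default encoding detection instead of aborting.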
Attachments
Issue Links
- duplicates TIKA-359: Calls to Charset.isSupported() will throw exceptions for invalid charset names (Closed)