The stack trace is related to my original problem, but it actually shows an inconsistency in POI's handling of UnsupportedEncodingException. POI has a try-catch block for that exception only around the first choice for guessing the 7-bit encoding. The second and third choices take whatever value could be pulled out of the header or the HTML meta http-equiv tag and call set7BitEncoding(charset) without a try-catch block.
It turns out another problem is that Charset.forName() can throw an UnsupportedCharsetException (not UnsupportedEncodingException), and POI's code doesn't check for that at all. And, while we're defending against trying to create a charset from whatever value we find in msg/html headers or codepoint values, we should also add IllegalCharsetNameException to the catch block. Both are subclasses of IllegalArgumentException, so we could just catch IllegalArgumentException and be done with it.
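To make the exception behavior concrete, here is a minimal standalone sketch using only the JDK. It is not POI or Tika code; the class and method names (CharsetLookupDemo, failureFor) are made up for illustration. It reports which exception Charset.forName() throws for a given name while catching only IllegalArgumentException.

```java
import java.nio.charset.Charset;

public class CharsetLookupDemo {

    // Hypothetical helper: returns the simple class name of the exception that
    // Charset.forName() throws for this name, or "none" if the lookup succeeds.
    static String failureFor(String name) {
        try {
            Charset.forName(name);
            return "none";
        } catch (IllegalArgumentException e) {
            // Covers UnsupportedCharsetException, IllegalCharsetNameException,
            // and the plain IllegalArgumentException thrown for a null name.
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(failureFor("UTF-8"));             // supported charset
        System.out.println(failureFor("x-no-such-charset")); // legal name, not supported
        System.out.println(failureFor("bad name!"));         // illegal characters in name
    }
}
```

A syntactically legal but unknown name fails with UnsupportedCharsetException, while a name containing illegal characters fails with IllegalCharsetNameException, so a catch of only one of the two would still let the other escape.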
As an immediate fix at the Tika level, we can duplicate POI's guess7BitEncoding but add the missing try-catch blocks around each choice. I'll also open an issue in POI's bug tracker so this gets fixed at the POI level too.
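Roughly, the guarded version could look like the sketch below. This is not POI's guess7BitEncoding or the eventual Tika patch; SafeCharsetGuess and firstUsable are hypothetical names, and the candidate list stands in for the values pulled from the message in order of preference. The point is only that every candidate gets the same protection, so a bad value falls through to the next choice instead of aborting the parse.

```java
import java.nio.charset.Charset;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class SafeCharsetGuess {

    // Hypothetical helper: returns the first candidate name that resolves to a
    // charset supported by this JVM, or empty if none of them does.
    static Optional<Charset> firstUsable(List<String> candidates) {
        for (String name : candidates) {
            if (name == null || name.trim().isEmpty()) {
                continue; // nothing was found for this choice
            }
            try {
                return Optional.of(Charset.forName(name.trim()));
            } catch (IllegalArgumentException e) {
                // Illegal or unsupported charset name: ignore it and try the next choice.
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Made-up candidate values, in order of preference.
        List<String> candidates =
                Arrays.asList("bogus charset!", null, "x-no-such-charset", "ISO-8859-1");
        System.out.println(firstUsable(candidates));
    }
}
```

The caller would then pass the resolved name to set7BitEncoding only when a usable charset was found, and otherwise leave the default encoding in place.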
Test files will be very helpful. If you can share, please do.