Description
We should add common tokens files for popular languages to tika-eval so that users don't have to generate their own.
Issue Links
- is depended upon by TIKA-2038: A more accurate facility for detecting Charset Encoding of HTML documents (Open)