Description
As reported on user@ in "non-West European languages support":
http://mail-archives.apache.org/mod_mbox/tika-user/201107.mbox/%3COF0C0A3275.DA7810E9-ONC22578CC.0051EEDE-C22578CC.0052548B@il.ibm.com%3E
The RTF parser seems to be doubling up some non-European characters.
Attachments
Issue Links
- is related to TIKA-4297 Duplicate Map keys in TextExtractor (Open)