If a mime type is
the value is incorrectly "UTF-8" (with the quotes included) rather than UTF-8.
Patch available at https://github.com/osi/tika/commit/b77814874ebff8f412ebb2f2adc52c6465d603c4
I have a CLA on file.
Nick Burch made changes -
|Status||Open [ 1 ]||Resolved [ 5 ]|
|Fix Version/s||1.1 [ 12318849 ]|
|Resolution||Fixed [ 1 ]|
peter royal made changes -
|Field||Original Value||New Value|
The RFC for MIME isn't clear on whether single quotes make a valid quoted string. Overall, the parser needs a bit more work to be fully RFC-compliant (quoted strings can contain equals signs, for instance).
I was just trying to fix the simple case I came across. The JavaMail API generates quoted charset fields for text attachments, which is how I found this.
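
For illustration, the simple case described here (a double-quoted charset value) could be handled roughly as follows. This is a minimal sketch, not the code from the linked patch or Tika's actual parser; the class `MediaTypeParams` and the methods `unquote` and `parseParameters` are hypothetical names. It splits on the first equals sign in each parameter, so a quoted value may contain `=`, but it still does not handle semicolons or escaped quotes inside a quoted string.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for illustration only; not Tika's implementation.
public class MediaTypeParams {

    // Strips one pair of surrounding double quotes, if present.
    public static String unquote(String value) {
        String v = value.trim();
        if (v.length() >= 2 && v.startsWith("\"") && v.endsWith("\"")) {
            return v.substring(1, v.length() - 1);
        }
        return v;
    }

    // Parses "type/subtype; name=value; ..." into a name -> value map.
    // Simplification: splits on every ';', so a ';' inside a quoted
    // string is not supported here.
    public static Map<String, String> parseParameters(String mediaType) {
        Map<String, String> params = new LinkedHashMap<>();
        String[] parts = mediaType.split(";");
        for (int i = 1; i < parts.length; i++) {
            String part = parts[i];
            int eq = part.indexOf('='); // first '=' separates name from value
            if (eq < 0) {
                continue; // not a name=value pair, skip it
            }
            String name = part.substring(0, eq).trim().toLowerCase(Locale.ROOT);
            params.put(name, unquote(part.substring(eq + 1)));
        }
        return params;
    }

    public static void main(String[] args) {
        String type = "text/plain; charset=\"UTF-8\"";
        System.out.println(parseParameters(type).get("charset"));
    }
}
```

With this sketch, both `charset="UTF-8"` and `charset=UTF-8` yield the same value, which is the behaviour the report asks for.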