Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Fixed
- Jena 3.17.0, Jena 4.1.0
- None
Description
U+FFFD is the Unicode replacement character, inserted in place of input bytes that cannot be decoded.
For parsers based on RIOT and its text tokenizer, warn if this character is seen in strings, IRIs, blank node labels and prefix names.
If the escaped form \uFFFD is seen (an escaped Unicode character written explicitly in the input), do not warn.
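
To illustrate the distinction, here is a minimal sketch in plain Java. It does not use RIOT or the actual tokenizer code; the class name `ReplacementCharCheck` and its methods are hypothetical and exist only for this example. It assumes the check runs on the raw token text, before escape sequences are processed, so an explicit `\uFFFD` escape is still six ordinary characters at that point.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not the RIOT tokenizer implementation.
public class ReplacementCharCheck {
    // The Unicode replacement character, U+FFFD.
    public static final char REPLACEMENT = '\uFFFD';

    // Lenient UTF-8 decoding: the String constructor substitutes U+FFFD
    // for malformed byte sequences instead of throwing an exception.
    public static String decodeLenient(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // True if the raw (not yet unescaped) token text contains U+FFFD itself.
    // An escape written out in the input is a backslash, 'u' and four hex
    // digits at this stage, so it does not match.
    public static boolean containsRawReplacement(String rawTokenText) {
        return rawTokenText.indexOf(REPLACEMENT) >= 0;
    }

    public static void main(String[] args) {
        // 0xFF can never appear in well-formed UTF-8.
        String decoded = decodeLenient(new byte[] { 'a', (byte) 0xFF, 'b' });
        // Six literal characters: backslash, u, F, F, F, D.
        String escaped = "a\\uFFFDb";

        for (String token : new String[] { decoded, escaped }) {
            if (containsRawReplacement(token)) {
                System.out.println("WARN: replacement character in token");
            } else {
                System.out.println("OK: no raw replacement character");
            }
        }
    }
}
```

In this sketch the first token triggers the warning because the bad byte was mapped to U+FFFD during decoding, while the second does not because the author wrote the escape deliberately.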
Issue Links
- depends upon: JENA-2118 Change IO.asUTF8 to map bad characters to U+FFFD not throw an exception. (Closed)