The highlighter provides a function to escape HTML, which does to much. To create valid HTML only ", <, >, & must be escaped, everything else can kept unescaped. The escaper unfortunately does also additionally escape everything > 127, which is unneeded if your web site has the correct encoding. It also produces huge amounts of HTML entities if used with eastern languages.
This would not be a bugf if the escaping would be correct, but it isn't, it escapes like that:
So it escapes not (as HTML needs) the unicode codepoint, instead it escapes the UTF-16 char, which is incorrect, e.g. for our all-time favourite Deseret:
U+10400 (deseret capital letter long i) would be escaped as �� and not as 𐐀.
So we should remove the stupid encoding of chars > 127 which is simply useless