We use maven-site-plugin to generate our documentation site (the source is xdoc) and noticed that non-ASCII characters, such as Japanese or Chinese, in table captions are not displayed correctly in the output HTML files.
During generation, these characters are encoded to entities (e.g. '和' becomes '&#x548c;') and are displayed correctly in the browser.
However, in a table caption, the leading '&' of the entity is escaped again into '&amp;'.
So, for example, the actual output becomes '&amp;#x548c;' while the expected output is '&#x548c;'.
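To illustrate the symptom outside of Doxia, here is a minimal, self-contained sketch. It is not Doxia's code: the class CaptionEscapeDemo and its three methods are hypothetical names made up for this report. It shows how escaping every '&' in text that already contains entities produces '&amp;#x548c;', and how skipping ampersands that already start an entity avoids that.

```java
public class CaptionEscapeDemo {

    // Hypothetical helper: encode every non-ASCII code point as a hex entity.
    static String encodeNonAscii(String text) {
        StringBuilder sb = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (cp > 0x7f) {
                sb.append("&#x").append(Integer.toHexString(cp)).append(';');
            } else {
                sb.appendCodePoint(cp);
            }
        });
        return sb.toString();
    }

    // Naive escaping: every '&' is escaped, including the one that starts an entity.
    static String escapeEveryAmpersand(String text) {
        return text.replace("&", "&amp;");
    }

    // Escapes only ampersands that do not already begin a character reference.
    static String escapeBareAmpersands(String text) {
        return text.replaceAll(
                "&(?!(#[0-9]+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);)", "&amp;");
    }

    public static void main(String[] args) {
        String encoded = encodeNonAscii("\u548c");          // the character '和'
        System.out.println(encoded);                        // entity form
        System.out.println(escapeEveryAmpersand(encoded));  // double-escaped form
        System.out.println(escapeBareAmpersands(encoded));  // entity left intact
    }
}
```

Whether skipping existing entities is the right behaviour for the sink is an assumption on my side; a literal '&#x548c;' typed in the source would also be left unescaped by this approach.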
To verify the issue, modify org.apache.maven.doxia.sink.impl.XhtmlBaseSinkTest.testTableCaption() as follows.
I am not sure if this is the proper fix, but I modified the following line in org.apache.maven.doxia.sink.impl.XhtmlBaseSink.write(String) ...
... to ...
... and the issue was resolved without breaking existing tests.
I have attached the above modification as a patch.
Please let me know if you prefer a PR on GitHub or need more info.
Thanks in advance,