Non-ASCII passwords, with both FORM and BASIC authentication, are converted to UTF-8 bytes, with each UTF-8 byte placed in a separate Unicode character in the password string. We are using a custom realm, but we expect this behaviour to be consistent across all realms. A logical implementation would be to decode each UTF-8 byte sequence into the equivalent Unicode character before presenting the Unicode password String in the interface. This is similar to Tomcat 5 bug 29091.
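To illustrate the symptom, here is a minimal sketch in modern Java. It assumes the one-byte-per-character string comes from decoding the UTF-8 bytes as ISO-8859-1, which matches the description above but is not confirmed for Tomcat itself. The class and method names (PasswordMojibake, garble, repair) are hypothetical.

```java
import java.nio.charset.StandardCharsets;

public class PasswordMojibake {
    // Reproduces the reported symptom: encode as UTF-8, then decode
    // one byte per char (assumed here to be an ISO-8859-1 decode).
    static String garble(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Reverses the garbling: recover the raw bytes, then decode them as UTF-8.
    // Only valid if every char in the input is <= U+00FF.
    static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String password = "p\u00e4ss"; // 4 characters, the second is non-ASCII
        String garbled = garble(password);
        System.out.println("original length: " + password.length());
        System.out.println("garbled length:  " + garbled.length());
        System.out.println("repaired equals original: "
                + repair(garbled).equals(password));
    }
}
```

The repair step only works while the byte values survive intact, which is why it helps for form parameters but not once a client has already discarded information.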
As bug 29091 states, this is not an easy fix.
Quick update: The fix (using the filter) described in 29091 appears to resolve the issues with the admin app (passwords are saved correctly in UTF-8), but a much trickier problem has emerged in testing. For BASIC auth the password is converted to bytes and base64 encoded. The problem appears to be that different browsers (at least IE and Firefox) make different encoding assumptions at this point (and neither seems to assume UTF-8), because the same username and password result in different Authorization headers. It is looking like another i18n grey area, but I will do some more work to see if there is anything that can be done to work around it.
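For anyone wanting to see why the headers differ, a small sketch follows. It builds a BASIC Authorization header from the same credentials using two different charsets. The charsets chosen (ISO-8859-1 and UTF-8) are illustrative assumptions and not a statement of what IE or Firefox actually use; the class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class BasicAuthEncoding {
    // Builds "Basic <base64(user:password)>" using the given charset
    // for the string-to-bytes step, which is where clients can disagree.
    static String basicHeader(String user, String password, Charset cs) {
        byte[] credentials = (user + ":" + password).getBytes(cs);
        return "Basic " + Base64.getEncoder().encodeToString(credentials);
    }

    public static void main(String[] args) {
        String user = "user";
        String password = "p\u00e4ss"; // contains one non-ASCII character

        String latin1 = basicHeader(user, password, StandardCharsets.ISO_8859_1);
        String utf8 = basicHeader(user, password, StandardCharsets.UTF_8);

        System.out.println(latin1);
        System.out.println(utf8);
        System.out.println("headers match: " + latin1.equals(utf8));
    }
}
```

With pure ASCII credentials both charsets produce identical bytes, so the disagreement only shows up once a non-ASCII character is involved.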
Yep. BASIC auth with non-ASCII passwords is a mess before the request even gets to Tomcat. Mozilla definitely (and I suspect IE as well) does a lossy conversion of non-ASCII usernames and passwords before base64 encoding. There is no way I can see of Tomcat supporting BASIC auth for non-ASCII usernames and passwords as things currently stand. On to FORM auth...
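A rough sketch of why a lossy client-side conversion cannot be fixed on the server: once two different passwords collapse to the same bytes, nothing downstream can tell them apart. The conversion used here (Java's ISO-8859-1 encoder, which substitutes '?' for unmappable characters) is only a stand-in and is an assumption; the exact conversion a given browser performs is not established in this report. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class LossyConversion {
    // Stand-in for a lossy client conversion: characters outside
    // ISO-8859-1 are replaced by the encoder's '?' substitute.
    static byte[] toLatin1(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String first = "\u4e2d";  // a character outside ISO-8859-1
        String second = "\u6587"; // a different character, also outside it

        byte[] a = toLatin1(first);
        byte[] b = toLatin1(second);

        // Distinct passwords, identical bytes on the wire.
        System.out.println("passwords equal: " + first.equals(second));
        System.out.println("bytes equal:     " + Arrays.equals(a, b));
    }
}
```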
FORM and DIGEST required a few small fixes - these have been applied to CVS for TC4 and TC5. Remember that for editing of non-ASCII usernames and passwords to work correctly via the admin app, the SetCharacterEncodingFilter must be configured.
The encoding used to interpret the login request can be forced (in TC5 at least) using the characterEncoding attribute of the org.apache.catalina.authenticator.FormAuthenticator valve - http://tomcat.apache.org/tomcat-5.5-doc/config/valve.html