I use the following construction to set the request character encoding:

    if (request.getCharacterEncoding() == null) {
        request.setCharacterEncoding("ISO8859-2");
    }

In older versions of Tomcat (5.0.7 tested) everything works fine, but in version 5.0.12 this construction changes nothing: ISO8859-2 characters in parameter values are replaced with "?".

    $ java -version
    java version "1.4.2"
    Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2-b28)
    Java HotSpot(TM) Client VM (build 1.4.2-b28, mixed mode)
I doubt there's actually a bug with this functionality. The query string character encoding handling did change, but it was previously broken. If you want i18n in a portable fashion, use a POST.
Sorry, but I don't understand. Is there a bug in this or not? If there is, why did you mark this report as invalid? If not, what do I have to do to make it work in my configuration? I don't want to use POST, because the HTTP RFC says when to use GET and when to use POST, and I'm convinced that in my situation GET is the better option and should be used.
Sorry, there's no bug. BZ is not there to discuss design decisions. If you want to do so, post on tomcat-dev. The only standard for URL encoding is to use UTF-8, but nobody follows the standard. You can also now configure the URI encoding in the connector. If you insist on using i18n with URL parameters, the result is that it won't work reliably, but of course, you're free to do what you want ;-) Please do not reopen the report.
*** Bug 25848 has been marked as a duplicate of this bug. ***
*** Bug 25958 has been marked as a duplicate of this bug. ***
From Mark:

Character encoding has been the source of quite a bit of debate on the tomcat-dev list in recent weeks. There have been a few changes (see summary below) as a result. Essentially some additional configuration options have been provided. The UTF-8 issue (also reported in bug 22666) has also been fixed.

Character encoding summary
==========================

There are a number of situations where there may be a requirement to use non-US ASCII characters in a URI. These include:
- Parameters in the query string
- Servlet paths

There is a standard for encoding URIs (http://www.w3.org/International/O-URL-code.html) but this standard is not consistently followed by clients. This causes a number of problems.

The functionality provided by Tomcat (4 and 5) to handle this less than ideal situation is described below.

1. The Coyote HTTP/1.1 connector has a useBodyEncodingForURI attribute which if set to true will use the request body encoding to decode the URI query parameters.
   - The default value is true for TC4 (breaks spec but gives consistent behaviour across TC4 versions)
   - The default value is false for TC5 (spec compliant but there may be migration issues for some apps)
2. The Coyote HTTP/1.1 connector has a URIEncoding attribute which defaults to ISO-8859-1.
3. The parameters class (o.a.t.u.http.Parameters) has a QueryStringEncoding field which defaults to the URIEncoding. It must be set before the parameters are parsed to have an effect.

Things to note regarding the servlet API:

1. HttpServletRequest.setCharacterEncoding() normally only applies to the request body NOT the URI.
2. HttpServletRequest.getPathInfo() is decoded by the web container.
3. HttpServletRequest.getRequestURI() is not decoded by container.

Other tips:

1. Use POST with forms to return parameters as the parameters are then part of the request body.
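
To illustrate why the choice of URI encoding matters, here is a minimal standalone sketch (not Tomcat code) that decodes the same percent-encoded byte with two charsets using the JDK's java.net.URLDecoder. The class name QueryDecodingDemo and the helpers decode and recode are hypothetical names chosen for this example. The recode helper shows a commonly used application-level workaround, which assumes the container decoded the query string as ISO-8859-1 (the URIEncoding default mentioned above) and that no bytes were lost in that step.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class QueryDecodingDemo {

    /** Percent-decodes a raw query value using the given charset. */
    static String decode(String raw, String charset)
            throws UnsupportedEncodingException {
        return URLDecoder.decode(raw, charset);
    }

    /**
     * Re-interprets a value that was decoded as ISO-8859-1 although the
     * client actually sent it in another charset. ISO-8859-1 maps every
     * byte to the code point of the same value, so the original bytes
     * can be recovered and decoded again.
     */
    static String recode(String decodedAsLatin1, String actualCharset)
            throws UnsupportedEncodingException {
        byte[] originalBytes = decodedAsLatin1.getBytes("ISO-8859-1");
        return new String(originalBytes, actualCharset);
    }

    public static void main(String[] args) throws Exception {
        // Byte 0xB1 is U+0105 (a with ogonek) in ISO-8859-2
        // but U+00B1 (plus-minus sign) in ISO-8859-1.
        String raw = "%B1";

        String asLatin1 = decode(raw, "ISO-8859-1");
        String asLatin2 = decode(raw, "ISO-8859-2");
        String fixed = recode(asLatin1, "ISO-8859-2");

        // Print code points in hex to avoid console encoding issues.
        System.out.println(Integer.toHexString(asLatin1.charAt(0))); // b1
        System.out.println(Integer.toHexString(asLatin2.charAt(0))); // 105
        System.out.println(Integer.toHexString(fixed.charAt(0)));    // 105
    }
}
```

In a servlet, the same recode step would be applied to the value returned by request.getParameter() for GET requests. Setting URIEncoding or useBodyEncodingForURI on the connector, as described above, avoids the need for it.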
*** Bug 26118 has been marked as a duplicate of this bug. ***
*** Bug 26393 has been marked as a duplicate of this bug. ***
*** Bug 30255 has been marked as a duplicate of this bug. ***