I found a possible bug in the class org/apache/catalina/connector/Response.java. I will try to explain the problem using the steps of the request flow (I use a filter, a servlet and a JSP page):

1. Tomcat gets a request from outside.
2. The filter gets the response and does the following:
   2.1 Sets the character encoding to UTF-8.
   2.2 Gets the writer with response.getWriter() and writes out a message if the content type is text/html. This locks the writer, which means the character encoding cannot be set any more after that.
3. The servlet gets the request and an exception occurs (this is a simulated exception).
4. Tomcat gets the request back and processes the error.
   4.1 The response is reset:
       4.1.1 The response itself is reset.
       4.1.2 The output stream is reset.
       Both 4.1.1 and 4.1.2 reset the encoding to ISO-8859-1, which is the default value.
5. The error page is called, which has the encoding UTF-8.
   5.1 BUG: The encoding of the page is not used because the writer is still locked, but the encoding in the writer has been set back to the default, which is ISO-8859-1.

My suggestion is to leave the character encoding untouched in the error case, because in the error case the encoding was already set somewhere before (e.g. in a filter or servlet).

The state at each step:

1. Tomcat (ISO-8859-1, writer unlocked)
2. Filter (-> UTF-8, writer locked)
3. Servlet (UTF-8, exception raised, writer locked)
4. Tomcat (ISO-8859-1, writer locked)
5. JSP (UTF-8 <-> ISO-8859-1 conflict because writer still locked -> setCharacterEncoding is locked)

With regards
Udo Walker
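The flow described above can be reproduced with a small stand-alone model. This is only a sketch: `ModelResponse` and `ErrorPageEncodingDemo` are hypothetical names, and the class is a simplified stand-in for the behaviour reported here, not Tomcat's actual `Response` implementation.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical, simplified stand-in for the response state described in the report.
class ModelResponse {
    private Charset encoding = StandardCharsets.ISO_8859_1; // container default
    private boolean usingWriter = false;                    // "writer locked" flag

    void setCharacterEncoding(Charset cs) {
        if (usingWriter) {
            return; // ignored once the writer has been obtained
        }
        encoding = cs;
    }

    void getWriter() {
        usingWriter = true; // obtaining the writer locks the encoding
    }

    // Models the reset in the error case as reported:
    // the encoding goes back to the default, but the writer stays locked.
    void resetForError() {
        encoding = StandardCharsets.ISO_8859_1;
    }

    Charset getCharacterEncoding() {
        return encoding;
    }

    boolean isUsingWriter() {
        return usingWriter;
    }
}

class ErrorPageEncodingDemo {
    static ModelResponse runFlow() {
        ModelResponse response = new ModelResponse();                // 1. Tomcat
        response.setCharacterEncoding(StandardCharsets.UTF_8);       // 2.1 filter
        response.getWriter();                                        // 2.2 filter locks the writer
        // 3. the servlet throws an exception (not modelled further)
        response.resetForError();                                    // 4. Tomcat resets the response
        response.setCharacterEncoding(StandardCharsets.UTF_8);       // 5. error page: call is ignored
        return response;
    }

    public static void main(String[] args) {
        ModelResponse response = runFlow();
        System.out.println("encoding=" + response.getCharacterEncoding()
                + ", writerLocked=" + response.isUsingWriter());
    }
}
```

In this model the error page ends up with the default encoding even though it asked for UTF-8, which is the conflict reported in step 5.1.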
I understand your use case and your concern. But we can't count on the encoding being set somewhere before, can we? Even detecting that we're in the error case as opposed to a normal reset is somewhat challenging. If you've got a patch you want us to consider, please attach it to this issue.
In bug 36814 I first thought it was some other problem in Tomcat. Then I found the problem described above. In bug 36814 I described a possible solution that uses a context parameter to set the default encoding of the container. That solution was rejected :( . I don't know how to solve the encoding problem if nobody is able to configure the default encoding. You could still keep ISO-8859-1 as the default encoding, but if a context parameter is set, use the encoding value given there instead.
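The fallback proposed above might look roughly like the following sketch. The parameter name `defaultResponseEncoding` and the class `DefaultEncodingResolver` are assumptions made for illustration; neither is an existing Tomcat setting or class, and the context parameters are represented by a plain `Map`.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Map;

class DefaultEncodingResolver {
    // Hypothetical context parameter name, chosen only for this example.
    static final String PARAM = "defaultResponseEncoding";

    // Returns the configured charset, or ISO-8859-1 if the parameter
    // is missing, blank, or names an unknown charset.
    static Charset resolve(Map<String, String> contextParams) {
        String name = contextParams.get(PARAM);
        if (name == null || name.trim().isEmpty()) {
            return StandardCharsets.ISO_8859_1;
        }
        try {
            return Charset.forName(name.trim());
        } catch (IllegalArgumentException e) {
            // Covers both illegal and unsupported charset names.
            return StandardCharsets.ISO_8859_1;
        }
    }
}
```

With no parameter set, the behaviour stays as it is today (ISO-8859-1); only deployments that set the parameter would see a different default.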
How about the following corrections?

org.apache.catalina.connector.Response:
---
public void reset(int status, String message) {
    reset();
    setStatus(status, message);
    usingWriter = false; // add for user error page
}
---
This lets the user error page set the encoding again. Even if a Writer object has already been created, I think it is usually no longer referenced, because the application (filter, servlet, etc.) has already finished.

org.apache.catalina.valves.ErrorReportValve, in protected void report(Request request, Response response, Throwable throwable):
---
...
try {
    response.setContentType("text/html");
    response.setCharacterEncoding("utf-8");
    // add for default error page
    if (!"utf-8".equals(response.getCharacterEncoding())) {
        response.getCoyoteResponse().setCharacterEncoding("utf-8");
    }
} catch (Throwable t) {
...
---
If the writer object has already been created, setCharacterEncoding will not work. So I think we must force the encoding directly on the Coyote response.

I know the specification says setCharacterEncoding takes effect only before getWriter is called, and says nothing about getWriter in the description of the reset method. But we need a fix for multi-byte character environments.
-1 for the patch you suggest. It works for your case but won't work for many users that don't use UTF-8. I have started a thread on the dev list about this.
The mailing list thread is here: http://marc.info/?l=tomcat-dev&m=117280911532391&w=2
The fix in the duplicate allows the encoding of the error page to be completely independent of the original page. *** This bug has been marked as a duplicate of 43236 ***