Bug 28966 - JSP pages with UTF-8 characters always displays as ISO-8859-1
Status: RESOLVED WONTFIX
Alias: None
Product: Tomcat 5
Classification: Unclassified
Component: Unknown
Version: 5.0.23
Hardware: PC Linux
Importance: P3 normal
Target Milestone: ---
Assignee: Tomcat Developers Mailing List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2004-05-14 02:29 UTC by John Constable
Modified: 2008-12-18 21:02 UTC



Description John Constable 2004-05-14 02:29:09 UTC
I have searched the web thoroughly for days now, as well as the bug database,
and I am convinced this problem isn't posted anywhere.

I have a website that talks to an Oracle database.  The Oracle database stores
most strings in UTF-8 format.  Everything works fine when Tomcat is run on
Win2k, XP, Mac OSX, and Solaris 8, but when I try to run the application on
Linux, the UTF-8 characters display as if ISO-8859-1 is being used.
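
For reference, the symptom described here is what results when UTF-8 bytes are decoded as ISO-8859-1. Below is a minimal sketch in plain Java that reproduces the effect outside Tomcat. The class name MojibakeDemo, its method names, and the sample string are made up for illustration, and it assumes a JDK recent enough to have java.nio.charset.StandardCharsets. It does not show where in the stack the wrong decoding happens.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    // Encode the string as UTF-8, then decode those bytes as ISO-8859-1.
    // Every non-ASCII character turns into two or more Latin-1 characters.
    static String misdecode(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Reverse the mistake: recover the raw bytes via ISO-8859-1
    // and decode them as UTF-8.
    static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "caf\u00e9";        // "cafe" with an acute accent on the e
        String garbled = misdecode(original); // e-acute becomes U+00C3 followed by U+00A9

        System.out.println("original length: " + original.length());
        System.out.println("garbled length:  " + garbled.length());
        System.out.println("repair restores original: " + repair(garbled).equals(original));
    }
}
```

If the garbled text on the Linux server matches this pattern (each accented character replaced by two characters, the first often an A with a tilde or circumflex), the bytes themselves are probably intact and only the decoding step is wrong.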

I have tried the following fixes:
- Added the charset filter that everyone speaks of to web.xml.
- Set environment variables such as JAVA_OPTS=-Dfile.encoding="UTF-8",
  LANG=UTF-8, and LANG=us_US.UTF-8.

Even with these changes, I still get the ISO representation.
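
One way to check whether settings like these actually reach the JVM is to print what the JVM itself reports. This is a small sketch, not part of Tomcat: the class name EncodingCheck and its report() method are hypothetical, and it would need to be run with the same JAVA_OPTS and environment as the Tomcat process to be meaningful.

```java
import java.nio.charset.Charset;

public class EncodingCheck {
    // Collect the encoding-related values the running JVM sees.
    static String report() {
        return "file.encoding=" + System.getProperty("file.encoding")
             + " defaultCharset=" + Charset.defaultCharset().name()
             + " LANG=" + System.getenv("LANG"); // prints "null" when LANG is unset
    }

    public static void main(String[] args) {
        System.out.println(report());
    }
}
```

Comparing this output between a working platform and the Linux host would show whether the two JVMs really start with different default encodings.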

Adding <%@ page contentType="text/html;charset=UTF-8" pageEncoding="UTF-8" %>
to the JSPs forced the encoding in the HTTP headers to UTF-8 (they were
ISO-8859-1 by default), but the result was identical.  On all the other
platforms, the encoding is already UTF-8 before any of these changes.

Now here is the part that really blows my mind.  If I save the JSP from another
server (with correct UTF-8 encoding), and put it on my server as an .html file,
Tomcat serves it up perfectly.  Headers are right, text displays, all is well. 
If I rename this to a .jsp, once again, the problem returns.

For this reason I am sure it isn't the DB.  Is there something in Jasper maybe
that could be imposing an incorrect charset onto files in Linux?  Once again,
this problem only seems to exist on Linux, in both AS 2.4, as well as the newest
Fedora.

Thanks!

John
Comment 1 Yoav Shapira 2004-05-28 18:12:27 UTC
I'm not familiar enough with the relevant code to offer a solution, but I do 
have a suggestion: send your query to the tomcat-user list and see if others 
can provide feedback.  If they do, maybe we can narrow this down to a specific 
location in the code, and then amend this bug report to include that specific 
information.
Comment 2 Yoav Shapira 2004-06-16 14:42:00 UTC
Have you tried experimenting with the javaEncoding attribute of the JspServlet 
(or of the JspC compiler, if you use that)?
Comment 3 Kin-Man Chung 2004-06-18 20:39:45 UTC
Unless you can provide me with a test case that clearly demonstrates the
problem, there's nothing I can do about it.