Bug 16912 - c:param encodes URL with default URLEncoder.encode(str), but must use URLEncoder.encode(str, encoding)
Status: RESOLVED FIXED
Alias: None
Product: Taglibs
Classification: Unclassified
Component: Standard Taglib
Version: 1.0
Hardware: All All
Importance: P3 major with 3 votes
Target Milestone: ---
Assignee: Tomcat Developers Mailing List
URL:
Keywords:
Duplicates: 19477
Depends on:
Blocks:
 
Reported: 2003-02-09 08:16 UTC by Vit Timchishin
Modified: 2004-11-16 19:05 UTC



Description Vit Timchishin 2003-02-09 08:16:23 UTC
The summary describes the problem: c:param does not handle Cyrillic characters correctly.
The fix is:
ParamSupport.java (common/core), line 123:
        try {
            if (encode) {
                // Encode with the response's character encoding rather than the platform default.
                String enc = pageContext.getResponse().getCharacterEncoding();
                parent.addParameter(URLEncoder.encode(name, enc),
                                    URLEncoder.encode(value, enc));
            } else {
                parent.addParameter(name, value);
            }
        } catch (java.io.UnsupportedEncodingException e) {
            throw new JspException(e.toString());
        }
Comment 1 Pierre Delisle 2003-02-26 22:54:06 UTC
Thanks for the bug report.

The fix is more elaborate than the one suggested because
URLEncoder.encode(String, String) was only added in J2SE 1.4, and
JSTL 1.0 must also run on earlier releases of J2SE.
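
For illustration, a version-dependent fix of the kind described here could look up the two-argument method at run time and use it only when it exists. This is a minimal sketch, not the committed JSTL code: the class name CompatUrlEncoder is hypothetical, and it is written in modern Java syntax for readability, so it would not itself compile on J2SE 1.3.

```java
import java.lang.reflect.Method;
import java.net.URLEncoder;

public class CompatUrlEncoder {
    // URLEncoder.encode(String, String), or null on runtimes that predate it.
    private static final Method ENCODE_WITH_CHARSET = lookup();

    private static Method lookup() {
        try {
            return URLEncoder.class.getMethod("encode", String.class, String.class);
        } catch (NoSuchMethodException e) {
            return null; // pre-1.4 runtime: only encode(String) is available
        }
    }

    /** Encodes s with the given charset when the runtime supports it. */
    public static String encode(String s, String enc) {
        if (ENCODE_WITH_CHARSET != null) {
            try {
                return (String) ENCODE_WITH_CHARSET.invoke(null, s, enc);
            } catch (Exception e) {
                // Covers an unsupported charset name as well as reflection failures.
                throw new IllegalArgumentException("Cannot encode with " + enc, e);
            }
        }
        return legacyEncode(s);
    }

    // Old behavior: encodes with the platform default charset.
    @SuppressWarnings("deprecation")
    private static String legacyEncode(String s) {
        return URLEncoder.encode(s);
    }

    public static void main(String[] args) {
        System.out.println(encode("a b", "UTF-8")); // prints "a+b"
    }
}
```

On a runtime without the two-argument method, this sketch silently reverts to the platform default encoding, so the original problem would remain there.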
Comment 2 Stefan Kuehnel 2003-02-27 15:14:13 UTC
From my understanding of the documentation for URLEncoder and the implementation
notes of the HTML spec
(http://www.w3.org/TR/html40/appendix/notes.html#non-ascii-chars), the parameter
name and value should be encoded using UTF-8, not using the document encoding as
the current fix does.  Looking at the Tomcat 4.x sources
(http://cvs.apache.org/viewcvs.cgi/jakarta-tomcat-4.0/catalina/src/share/org/apache/catalina/connector/HttpRequestBase.java),
it seems it uses the document encoding for parameter decoding, but shouldn't
there at least be an option to specify the parameter encoding to use?
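
To make the difference concrete, the sketch below encodes the same Cyrillic word once with UTF-8 and once with a document encoding of windows-1251. The class name EncodingComparison and the sample word are assumptions chosen for illustration, and the example assumes the JVM ships the windows-1251 charset.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class EncodingComparison {
    /** Wraps URLEncoder.encode so callers need not handle the checked exception. */
    static String encode(String value, String charset) {
        try {
            return URLEncoder.encode(value, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        // "Privet" in Cyrillic, written with Unicode escapes so the
        // source file's own encoding does not matter.
        String word = "\u041f\u0440\u0438\u0432\u0435\u0442";

        // Two bytes per letter in UTF-8.
        System.out.println(encode(word, "UTF-8"));
        // prints %D0%9F%D1%80%D0%B8%D0%B2%D0%B5%D1%82

        // One byte per letter in windows-1251.
        System.out.println(encode(word, "windows-1251"));
        // prints %CF%F0%E8%E2%E5%F2
    }
}
```

The two query strings are both valid percent-encodings, so the receiving side can only recover the text if it decodes with the same charset that was used to encode.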
Comment 3 Vit Timchishin 2003-02-27 15:41:57 UTC
This would be more correct as soon as this would be parsed correctly by tomcat.
For now next test:
<c:out value="${param.param}"/>
<p><a href='test3.jsp?param=<%= java.net.URLEncoder.encode("&#1055;&#1088;&#1080;&#1074;&#1077;&#1090;", "UTF-8")
%>'>Click</a>

(file is test3.jsp) gives test3.jsp?param=%D0%9F%D1%80%D0%B8%D0%B2%D0%B5%D1%82
URL that is parsed correctly in Mozilla (displayed OK in status bar), but
incorrectly in Tomcat - &#1072;&#65533;&#1073;&#65533;&#1072;&#1048;&#1072;&#1042;&#1072;&#1045;&#1073;&#65533; is displayed by c:out instead of &#1055;&#1088;&#1080;&#1074;&#1077;&#1090;.
Note that the test3.jsp also has next statements (that allows me to use cyrillic
correctly):
<%@ page pageEncoding="windows-1251" %>
<% if (request.getCharacterEncoding() == null)
request.setCharacterEncoding(response.getCharacterEncoding());
%>
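
The general failure mode can be shown outside a servlet container: a query string encoded as UTF-8 only round-trips if it is decoded as UTF-8. The sketch below is an illustration with java.net.URLDecoder, not a reproduction of Tomcat's parameter parsing, and its garbled output need not match the string quoted above. The class name DecodeMismatch is hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class DecodeMismatch {
    /** Wraps URLDecoder.decode so callers need not handle the checked exception. */
    static String decode(String encoded, String charset) {
        try {
            return URLDecoder.decode(encoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        // The parameter value from the test URL, percent-encoded as UTF-8.
        String encoded = "%D0%9F%D1%80%D0%B8%D0%B2%D0%B5%D1%82";
        String expected = "\u041f\u0440\u0438\u0432\u0435\u0442";

        // Matching charset: the original six-letter word comes back.
        String good = decode(encoded, "UTF-8");
        System.out.println(good.equals(expected)); // prints true

        // Mismatched charset: each of the 12 bytes becomes its own character.
        String bad = decode(encoded, "windows-1251");
        System.out.println(bad.equals(expected));  // prints false
        System.out.println(bad.length());          // prints 12
    }
}
```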
Comment 4 Dmitry Andrianov 2003-03-05 15:11:36 UTC
The way you fixed this bug will work on JDK 1.4 but will fall back to the old
behavior on JDK 1.3. In effect, that means the bug is not fixed.

It would be better to put your own urlEncode implementation inside JSTL
and use it on JDK 1.3. The simplest way is to use the URLEncoder.encode source from 1.4.
Comment 5 Pierre Delisle 2003-04-30 14:33:15 UTC
*** Bug 19477 has been marked as a duplicate of this bug. ***
Comment 6 Pierre Delisle 2003-04-30 23:01:52 UTC
Stefan is right about the HTML spec. However, this part of the HTML
spec was apparently produced too late to have an impact on
reality. Browsers generally encode the query string using the
character encoding of the page containing the form. Moreover,
the JSP 2.0 spec also adopts this convention for internally
generated query strings.

It therefore seems wise to follow suit with what everyone else is doing.

I've updated the code to do the encoding as follows:
   Util.URLEncode(name, enc)
where the URLEncode method has been lifted from the Jasper2 source code
(we now use the same code for both J2SE 1.3 and J2SE 1.4),
and where enc is 'pageContext.getResponse().getCharacterEncoding()'.
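
As a rough idea of what a self-contained encoder with an explicit encoding involves, here is a minimal sketch. It is not the Jasper2 or JSTL code: the class name SimpleUrlEncoder and its set of unreserved characters (the one java.net.URLEncoder documents) are assumptions, and it ignores the quirks of stateful encodings.

```java
import java.io.UnsupportedEncodingException;

public class SimpleUrlEncoder {
    // Characters emitted unchanged, mirroring java.net.URLEncoder's documented set.
    private static final String SAFE =
        "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-_.*";

    /** Percent-encodes s using the bytes of the given character encoding. */
    public static String urlEncode(String s, String enc) {
        StringBuilder out = new StringBuilder(s.length() * 3);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == ' ') {
                out.append('+');
            } else if (SAFE.indexOf(c) >= 0) {
                out.append(c);
            } else {
                // Keep a surrogate pair together so it is encoded as one character.
                int end = i + 1;
                if (Character.isHighSurrogate(c) && end < s.length()
                        && Character.isLowSurrogate(s.charAt(end))) {
                    end++;
                }
                for (byte b : toBytes(s.substring(i, end), enc)) {
                    out.append('%');
                    out.append(Character.toUpperCase(Character.forDigit((b >> 4) & 0xF, 16)));
                    out.append(Character.toUpperCase(Character.forDigit(b & 0xF, 16)));
                }
                i = end - 1;
            }
        }
        return out.toString();
    }

    private static byte[] toBytes(String s, String enc) {
        try {
            return s.getBytes(enc);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + enc, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(urlEncode("a b&c", "UTF-8")); // prints "a+b%26c"
    }
}
```

Because it relies only on String.getBytes(String), an encoder along these lines behaves the same on every J2SE release, which is the property the shared code path is meant to provide.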