Bug 52579 - Tomcat5.5.35+Java1.5 cannot return proper value of a request parameter
Status: RESOLVED WONTFIX
Product: Tomcat 5
Classification: Unclassified
Component: Connector:HTTP
Version: 5.5.35
Hardware: All All
Importance: P2 regression
Target Milestone: ---
Assigned To: Tomcat Developers Mailing List
Depends on:
Blocks:
 
Reported: 2012-02-02 10:03 UTC by Hiroki Hayashi
Modified: 2012-10-01 08:12 UTC (History)



Attachments
JSP file to reproduce the matter (471 bytes, text/plain)
2012-02-02 10:03 UTC, Hiroki Hayashi
Test.java - Test Charset.decode() (877 bytes, text/plain)
2012-02-02 11:00 UTC, Konstantin Kolinko
new implementation of ByteChunk.toStringInternal() (1.75 KB, patch)
2012-02-03 09:32 UTC, Keiichi Fujino
patch v2 (3.23 KB, text/plain)
2012-02-06 08:20 UTC, Keiichi Fujino

Description Hiroki Hayashi 2012-02-02 10:03:26 UTC
Created attachment 28251 [details]
JSP file to reproduce the matter

(1) Overview

When I install Tomcat5.5.35+jdk1.5.0_22 and run the JSP (please see the attached file),
I cannot get the proper value of a request parameter.

If I enter a multibyte value (e.g. 10 or aa) into the text box of the JSP,
it works correctly and I get the input value back (e.g. 10 or aa).
But if I enter a single-byte character (e.g. "1" or "a"),
it fails and I get nothing back.

Please advise me.
(Our customers are also waiting for the reason.)
Thank you.


(2) Steps to Reproduce
[2-1] Install Tomcat5.5.35+jdk1.5.0_22
[2-2] Deploy the JSP file in the following directory.
      /apache-tomcat-5.5.35/webapps/jsp-examples
[2-3] Enter a single-byte character (e.g. "1" or "0") into the text box and press the OK button.


(3) Actual Results
The "message" shows nothing.


(4) Expected Results
The "message" shows the input character.


(5) Build Date & Platform
Build 2012-02-02 on Windows7
(I suppose it does not depend on the Platform.)


(6) Additional Information
Tomcat5.5.34+jdk1.5.0_22 runs correctly.
So the following code may be the cause:

---
org.apache.tomcat.util.buf.ByteChunk.toStringInternal()

# Line514

CharBuffer cb;
cb = charset.decode(ByteBuffer.wrap(buff, start, end-start));
return new String(cb.array(), cb.arrayOffset(), cb.length());
---
Comment 1 Konstantin Kolinko 2012-02-02 10:37:21 UTC
Similar recent discussion on users@:
("POST data (single character) cleared when using tomcat 6.0.33 and Character Encoding Filter")
http://marc.info/?t=132668010800001&r=1&w=2
http://markmail.org/message/o7l2p7ve5cpswnzl

You have stumbled upon a bug in the charset implementation in Java 1.5:
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6196991
Comment 2 Konstantin Kolinko 2012-02-02 11:00:11 UTC
Created attachment 28252 [details]
Test.java - Test Charset.decode()

I am attaching a test class that I wrote based on the reproduction scenario in bug 6196991 plus the charset enumeration code from r1140904.

This test prints the names of charsets that cannot perform an encoding+decoding round trip for a single "A" character.
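
The kind of check described above can be sketched as follows. This is not the attached Test.java; the class name CharsetRoundTrip and its methods are made up for illustration, and it only assumes the standard java.nio.charset API. Charsets that cannot encode at all are skipped, and a charset that cannot represent "A" will also be reported, since the replacement bytes do not decode back to "A".

```java
import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

public class CharsetRoundTrip {

    // True if encoding and then decoding the text with this charset gives the text back.
    static boolean roundTrips(Charset cs, String text) {
        ByteBuffer bytes = cs.encode(text);
        return text.equals(cs.decode(bytes).toString());
    }

    // Names of all installed charsets that fail the round trip for the given text.
    static List<String> brokenCharsets(String text) {
        List<String> broken = new ArrayList<String>();
        for (Charset cs : Charset.availableCharsets().values()) {
            if (!cs.canEncode()) {
                continue; // decode-only charsets cannot be tested this way
            }
            try {
                if (!roundTrips(cs, text)) {
                    broken.add(cs.name());
                }
            } catch (RuntimeException e) {
                broken.add(cs.name()); // treat an unexpected failure as broken
            }
        }
        return broken;
    }

    public static void main(String[] args) {
        for (String name : brokenCharsets("A")) {
            System.out.println(name);
        }
    }
}
```

The output depends on the JRE version and on which charsets are installed, so no particular list should be expected.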


Here is the list of charsets that are affected by this issue,
tested with 1.5.0_20-b02, on Windows:
---
Big5
Big5-HKSCS
EUC-JP
EUC-KR
GB2312
GBK
ISO-2022-JP
JIS_X0212-1990
Shift_JIS
windows-31j
+ about two dozen non-standard charsets whose names start with "x-"
---

With 1.4.2_19-b04 on Windows the list is the same, except for GB2312, which is absent.

With 1.6.0_30-b12 on Windows the list contains only this charset:
----
JIS_X0212-1990
+ 4 non-standard charsets whose names start with "x-"
----

So:
1. The issue is indeed a bug in the JRE.

It is present in the latest public versions of 1.4 and 1.5 that I have. I do not know anything about later "Java for business" versions.

2. The issue is absent in Oracle/Sun JDK 1.6.0_30.

3. The issue affects only certain encodings.

If you can update your configuration and applications to use UTF-8, you would avoid this issue.
Comment 3 Keiichi Fujino 2012-02-03 09:32:27 UTC
Created attachment 28257 [details]
new implementation of ByteChunk.toStringInternal()

Hi All.

I am using a charset affected by this issue.
Although I know this is an issue in Java,
I propose a new implementation of ByteChunk.toStringInternal().

I will propose it in STATUS.txt (for both 5.5.x and 6.0.x).
Comment 4 Hiroki Hayashi 2012-02-03 10:22:20 UTC
Thank you very much for the answer, Mr. Kolinko.
And thank you for the patch for the issue, Mr. Fujino.

I ran Mr. Kolinko's program,
and it reported charsets such as Shift_JIS as broken.

I understand that the issue is a bug in the JRE,
and it is true that the support period for Java 5 has ended.
Thank you, sir.

On the other hand, the following page carries the message "Tomcat5.5.x requires 5.0 or later":
http://tomcat.apache.org/tomcat-5.5-doc/building.html#Download_and_install_a_Java_Development_Kit_1.4.x_or_later

So we hope that the patch will be applied.

Thank you very much.
Comment 5 Konstantin Kolinko 2012-02-04 00:46:39 UTC
(In reply to comment #3)
> Created attachment 28257 [details]
> new implementation of ByteChunk.toStringInternal()
>

-1. There are two errors:

1) "return new String(buff, start, end-start);" is just wrong. It converts bytes to String using OS default encoding.

As far as I understand, the "result.isUnderflow()" condition means that all input data has been processed. This "return new String" code just handles an unexpected state.

I suggest to replace that code by  "cr.throwException();".

2) "charset.newDecoder()" is expected to be an expensive operation. In the scenario of CVE-2012-0022 I expect it to have a notable impact on performance.

Charset.decode() uses a ThreadLocal-based cache of decoders. Maybe we can implement something like that cache, or just use a simple ThreadLocal (or other way) to pass a Decoder instance around while processing the same request.
Comment 6 Konstantin Kolinko 2012-02-04 01:29:57 UTC
(In reply to comment #5)
> Maybe we can 
> implement (...) just use a simple ThreadLocal
> to pass a Decoder instance around while processing the same request.

If a Decoder instance is obtained from a ThreadLocal, a quick way to check that it matches the required charset is to compare that charset with decoder.charset().
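
A minimal sketch of that idea, assuming a single cached decoder per thread. The class DecoderCache and the method decoderFor are hypothetical names, not Tomcat code and not the attached patch. The REPLACE error actions are also an assumption, chosen to mirror what Charset.decode() is documented to do.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;

public class DecoderCache {

    // Holds at most one decoder per thread (hypothetical helper).
    private static final ThreadLocal<CharsetDecoder> CACHE =
            new ThreadLocal<CharsetDecoder>();

    // Returns a reset decoder for the charset, reusing the cached one when it matches.
    static CharsetDecoder decoderFor(Charset charset) {
        CharsetDecoder decoder = CACHE.get();
        // The quick check from the comment above: compare with decoder.charset().
        if (decoder == null || !decoder.charset().equals(charset)) {
            decoder = charset.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPLACE)
                    .onUnmappableCharacter(CodingErrorAction.REPLACE);
            CACHE.set(decoder);
        }
        // Clear any state left over from a previous decoding operation.
        return decoder.reset();
    }
}
```

Since a request is usually decoded with one charset, a single slot should be hit most of the time; a thread that alternates between charsets would create a new decoder on every switch.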


3) For large input data the current implementation that calls Charset.decode() is better than the proposed one, because it allocates less memory. The difference is between (size * averageCharsPerByte()) and (size * maxCharsPerByte()).

I think threshold can be around 10 bytes.

Java bug #6196991 occurs when the value of (input size * decoder.averageCharsPerByte()), coerced to an integer, is 0. In that case, in Java 5, the CharsetDecoder#decode(ByteBuffer) method erroneously behaves as if no input data were available. Input longer than 10 bytes should not trigger bug #6196991.
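
The arithmetic can be checked with a short sketch. The class CapacityEstimate and the method estimatedChars are illustrative names only; the actual averageCharsPerByte() values depend on the JRE, so the program prints them rather than assuming them. For any decoder whose average is at least 0.1, an input of more than 10 bytes gives a product above 1, which no longer truncates to 0.

```java
import java.nio.charset.Charset;

public class CapacityEstimate {

    // The product from the comment above, truncated to an int.
    static int estimatedChars(Charset cs, int inputBytes) {
        return (int) (inputBytes * cs.newDecoder().averageCharsPerByte());
    }

    public static void main(String[] args) {
        String[] names = {"UTF-8", "Shift_JIS", "EUC-JP"};
        for (String name : names) {
            if (!Charset.isSupported(name)) {
                continue; // not every JRE ships every charset
            }
            Charset cs = Charset.forName(name);
            System.out.println(name + ": averageCharsPerByte = "
                    + cs.newDecoder().averageCharsPerByte());
            for (int n = 1; n <= 3; n++) {
                System.out.println("  " + n + " byte(s) -> estimate "
                        + estimatedChars(cs, n));
            }
        }
    }
}
```

An estimate of 0 for a 1-byte input marks a charset where the condition described above is met.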
Comment 7 Keiichi Fujino 2012-02-06 08:20:18 UTC
Created attachment 28274 [details]
patch v2

Many thanks for the comments.

I have reimplemented ByteChunk.toStringInternal().

> I suggest to replace that code by  "cr.throwException();".

The code was replaced by result.throwException().
The CharacterCodingException is thrown as a RuntimeException.

> Charset.decode() uses a ThreadLocal-based cache of decoders. Maybe we can
> implement something like that cache, or just use a simple ThreadLocal (or other
> way) to pass a Decoder instance around while processing the same request.

A cache for the Decoder was created using a simple ThreadLocal.
This cache is very simple for now.
Only one Decoder instance is cached at any time.
Caching two or more Decoder instances would require some refactoring.
In that case, the code would become slightly more complicated.

> 3) For large input data the current implementation that calls Charset.decode()
> is better than the proposed one, because it allocates less memory. The
> difference is between (size * averageCharsPerByte()) and (size *
> maxCharsPerByte()).
>
> I think threshold can be around 10 bytes.

The threshold value was added.
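
For readers without access to the attachment, the overall approach discussed in this thread might look roughly like the sketch below. It is not the attached patch: the class SmallInputDecode, the constant THRESHOLD and the REPLACE error actions are assumptions made for illustration, and the decoder cache is left out to keep the sketch short.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;

public class SmallInputDecode {

    // Threshold in bytes; "around 10" is the value suggested in comment 6.
    static final int THRESHOLD = 10;

    static String decode(Charset charset, byte[] buff, int start, int end) {
        int len = end - start;
        ByteBuffer in = ByteBuffer.wrap(buff, start, len);
        if (len > THRESHOLD) {
            // Large input: Charset.decode() allocates less memory.
            return charset.decode(in).toString();
        }
        // Small input: size the output by maxCharsPerByte so it is never 0 for len > 0.
        CharsetDecoder decoder = charset.newDecoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        CharBuffer out = CharBuffer.allocate(
                (int) Math.ceil(len * decoder.maxCharsPerByte()));
        try {
            CoderResult result = decoder.decode(in, out, true);
            if (result.isUnderflow()) {
                result = decoder.flush(out);
            }
            if (!result.isUnderflow()) {
                result.throwException(); // unexpected state
            }
        } catch (CharacterCodingException e) {
            throw new RuntimeException(e);
        }
        out.flip();
        return out.toString();
    }
}
```

A call such as decode(Charset.forName("Shift_JIS"), bytes, 0, 1) takes the small-input path and so does not go through Charset.decode() at all.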
Comment 8 Mark Thomas 2012-02-17 18:29:36 UTC
I am still leaning heavily towards WONTFIX for this.

This issue affects a version of the JVM where fixes are no longer provided for free by Oracle. Users of such a JVM have two options:
1. Upgrade to a JVM release (minimum 1.6) where this is fixed and Oracle continue to make fixes freely available.
2. Pay for Oracle support.

I am extremely reluctant to start adding significant chunks of code into what is a very old Tomcat release in order to work around a bug in a JVM that no-one should be using unless they are paying for support.
Comment 9 Mark Thomas 2012-10-01 08:12:22 UTC
I am vetoing this proposed fix. My reasons are set out in comment #8 above.