Details
Bug
Status: Closed
Minor
Resolution: Fixed
2.5
None
java version "1.6.0_16"
Java(TM) SE Runtime Environment (build 1.6.0_16-b01)
Java HotSpot(TM) 64-Bit Server VM (build 14.2-b01, mixed mode)
Microsoft Windows [Version 6.0.6002]
Apache Maven 2.2.1 (r801777; 2009-08-06 12:16:01-0700)
Java version: 1.6.0_16
Java home: C:\Program Files\Java\jdk1.6.0_16\jre
Default locale: en_US, platform encoding: Cp1252
OS name: "windows vista" version: "6.0" arch: "amd64" Family: "windows"
Description
The StringUtils.containsAny methods incorrectly match Unicode 2.0+ supplementary characters.
For example, define test fixtures for the Unicode characters U+20000 and U+20001, which are written in Java source as the surrogate pairs "\uD840\uDC00" and "\uD840\uDC01":
private static final String CharU20000 = "\uD840\uDC00";
private static final String CharU20001 = "\uD840\uDC01";
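As a quick check of the fixtures above, the following sketch shows how a supplementary code point maps to a surrogate pair in a Java String. The class name SurrogatePairDemo and the helper fromCodePoint are hypothetical names used only for illustration; the JRE calls (Character.toChars, String.codePointCount, Character.isHighSurrogate) are standard.

```java
final class SurrogatePairDemo {

    /** Hypothetical helper: encodes a code point as a String of one or two UTF-16 chars. */
    static String fromCodePoint(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        String u20000 = fromCodePoint(0x20000);
        String u20001 = fromCodePoint(0x20001);

        // The code point is stored as a high surrogate followed by a low surrogate.
        System.out.println(u20000.equals("\uD840\uDC00"));              // true
        System.out.println(u20000.length());                            // 2
        System.out.println(u20000.codePointCount(0, u20000.length()));  // 1

        // Both fixtures start with the same high surrogate and differ only in the low one.
        System.out.println(Character.isHighSurrogate(u20000.charAt(0))); // true
        System.out.println(u20000.charAt(0) == u20001.charAt(0));        // true
        System.out.println(u20000.charAt(1) == u20001.charAt(1));        // false
    }
}
```

Because the two fixtures share their first char, any comparison that works on individual chars rather than whole code points can report a match between them.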
The JRE handles Unicode supplementary characters correctly, as this call shows:
assertEquals(-1, CharU20000.indexOf(CharU20001));
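But these assertions fail, presumably because the two strings share the high surrogate \uD840 and are compared one char at a time: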
assertEquals(false, StringUtils.containsAny(CharU20000, CharU20001));
assertEquals(false, StringUtils.containsAny(CharU20001, CharU20000));
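The failing assertions above can be reproduced without the library. The sketch below is not the Commons Lang source: containsAnyChar is a hypothetical char-by-char search that shows one way such a false match can arise, and containsAnyCodePoint is a hypothetical alternative that walks both strings by code point. Both method names and the class name are assumptions made for illustration.

```java
final class ContainsAnySketch {

    /** Naive search comparing individual UTF-16 chars (illustrative only). */
    static boolean containsAnyChar(String str, String searchChars) {
        if (str == null || searchChars == null) {
            return false;
        }
        for (int i = 0; i < str.length(); i++) {
            for (int j = 0; j < searchChars.length(); j++) {
                if (str.charAt(i) == searchChars.charAt(j)) {
                    return true; // also fires when only a surrogate half matches
                }
            }
        }
        return false;
    }

    /** Search comparing whole code points, so a surrogate pair is matched as a unit. */
    static boolean containsAnyCodePoint(String str, String searchChars) {
        if (str == null || searchChars == null) {
            return false;
        }
        for (int i = 0; i < str.length(); ) {
            int cp = str.codePointAt(i);
            for (int j = 0; j < searchChars.length(); ) {
                int candidate = searchChars.codePointAt(j);
                if (cp == candidate) {
                    return true;
                }
                j += Character.charCount(candidate); // skip 1 or 2 chars
            }
            i += Character.charCount(cp);
        }
        return false;
    }

    public static void main(String[] args) {
        String u20000 = "\uD840\uDC00";
        String u20001 = "\uD840\uDC01";
        System.out.println(containsAnyChar(u20000, u20001));      // true (false positive)
        System.out.println(containsAnyCodePoint(u20000, u20001)); // false
        System.out.println(containsAnyCodePoint(u20000, u20000)); // true
    }
}
```

Iterating by code point costs a little more per step than a plain char loop, but it keeps each surrogate pair together, which is the property the failing assertions depend on.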
These assertions pass:
assertEquals(true, StringUtils.contains(CharU20000 + CharU20001, CharU20000));
assertEquals(true, StringUtils.contains(CharU20000 + CharU20001, CharU20001));
assertEquals(true, StringUtils.contains(CharU20000, CharU20000));
assertEquals(false, StringUtils.contains(CharU20000, CharU20001));
because StringUtils.contains calls the JRE to perform the match.
More than you want to know: