[LANG-507] StringEscapeUtils.unescapeJava should support \u+ notation - ASF JIRA

Details

Type: Improvement
Status: Closed
Priority: Trivial
Resolution: Fixed
Affects Version/s: 2.4
Fix Version/s: 3.0
Component/s: lang.*
Labels: None

Description

Currently, when I try to unescape a String containing Unicode escapes in the common notation, e.g., \u+0022, the call fails with an exception caused by a NumberFormatException:

org.apache.commons.lang.exception.NestableRuntimeException: Unable to parse unicode value: +002

Note that the number is also extracted incorrectly: it is cut short by one character. The parser evidently does not expect the '+' and takes only the four characters that follow "\u", so the '+' is included and the last digit is dropped.
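The truncation described above can be illustrated with a minimal sketch. It is not the Commons Lang source; the class and method names are hypothetical, and it only assumes that the parser slices a fixed window of four characters after the escape prefix.

```java
// Hypothetical illustration, not the actual StringEscapeUtils code.
// Assumption: the parser reads exactly four characters after the
// backslash-u prefix and treats them as the hex code.
final class FixedWidthUnicodeDemo {

    /** Returns the four characters that follow the two-character prefix at backslashIndex. */
    static String hexCandidate(String input, int backslashIndex) {
        int start = backslashIndex + 2; // skip the backslash and the 'u'
        return input.substring(start, start + 4);
    }

    /** True if every character of s is a valid hexadecimal digit. */
    static boolean isAllHex(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (Character.digit(s.charAt(i), 16) < 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String candidate = hexCandidate("\\u+0022", 0);
        System.out.println(candidate);           // prints +002
        System.out.println(isAllHex(candidate)); // prints false
    }
}
```

The window starts at the '+', so the final '2' never reaches the number parser, which matches the "+002" in the error message.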

I am aware that in Java a Unicode escape is written as "\u" followed by four hexadecimal digits that give the character's code, but the \u+ notation is commonly used outside the Java world, and it would be very handy if StringEscapeUtils supported it, at least as an option.
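One way the requested behaviour could look is sketched below. This is an assumption-laden illustration rather than the fix that went into 3.0: the class name LenientUnicodeUnescaper is hypothetical, only Unicode escapes are handled, and escaped backslashes are deliberately not treated specially.

```java
// Hypothetical sketch of a lenient unescaper; not the Commons Lang implementation.
// It accepts both the standard form (backslash, 'u', four hex digits) and the
// variant with a '+' between the 'u' and the digits.
final class LenientUnicodeUnescaper {

    static String unescape(String input) {
        StringBuilder out = new StringBuilder(input.length());
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c == '\\' && i + 1 < input.length() && input.charAt(i + 1) == 'u') {
                int pos = i + 2;
                // Skip one optional '+' before the hex digits.
                if (pos < input.length() && input.charAt(pos) == '+') {
                    pos++;
                }
                if (pos + 4 <= input.length() && isHex(input, pos, pos + 4)) {
                    int code = Integer.parseInt(input.substring(pos, pos + 4), 16);
                    out.append((char) code);
                    i = pos + 4;
                    continue;
                }
            }
            // Not a well-formed escape: copy the character through unchanged.
            out.append(c);
            i++;
        }
        return out.toString();
    }

    private static boolean isHex(String s, int from, int to) {
        for (int k = from; k < to; k++) {
            if (Character.digit(s.charAt(k), 16) < 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(unescape("\\u+0022")); // prints "
        System.out.println(unescape("\\u0041"));  // prints A
    }
}
```

Malformed sequences are passed through as-is here instead of raising an exception; whether a library should be that forgiving, or make the '+' handling opt-in as the report suggests, is a design choice this sketch does not settle.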

Would you please consider adding this feature to 3.0?

Issue Links

relates to: LANG-505 Rewrite StringEscapeUtils (Closed)

People

Assignee: Unassigned
Reporter: Gregor B. Rosenauer
Votes: 0
Watchers: 0

Dates

Created: 05/Jun/09 14:37
Updated: 17/Dec/09 03:41
Resolved: 18/Oct/09 07:26