[OFBIZ-10275] UtilCodec URL decoding breaks values with German umlauts - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: Trunk
Fix Version/s: 16.11.05, 17.12.01, 18.12.01
Component/s: framework
Labels: None

Description

The problem also affects other characters that UTF-8 encodes as two bytes, and therefore as two percent-encoded hex values, as in this example:

String example = "/webcontent/example_öl.jpg";
String encoded = UtilCodec.getEncoder("url").encode(example);
System.out.println(encoded);
=> "%2Fwebcontent%2Fexample_%C3%B6l.jpg"

String decoded = UtilCodec.getDecoder("url").decode(encoded);
System.out.println(decoded);
=> "/webcontent/example_Ã¶l.jpg"

The reason for this is the OWASP ESAPI PercentCodec implementation used within the method UtilCodec.canonicalize, called before the proper decoding via java.net.URLDecoder here:

public String decode(String original) {
    try {
        String canonical = canonicalize(original);
        return URLDecoder.decode(canonical, "UTF-8");
    } catch (UnsupportedEncodingException ee) {
        Debug.logError(ee, module);
        return null;
    }
}
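
The garbled output can be reproduced with JDK-only code. The sketch below assumes that the canonicalization step turns each percent escape into a single character on its own, which is consistent with the output shown above. `PercentDecodeDemo` and `decodeEachEscapeAsChar` are hypothetical names used for illustration; this is not the ESAPI implementation.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

class PercentDecodeDemo {

    // Hypothetical stand-in for a per-escape decoder: every %XX becomes
    // one char with the code point XX, ignoring multi-byte UTF-8 sequences.
    static String decodeEachEscapeAsChar(String input) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c == '%' && i + 2 < input.length()) {
                int hi = Character.digit(input.charAt(i + 1), 16);
                int lo = Character.digit(input.charAt(i + 2), 16);
                if (hi >= 0 && lo >= 0) {
                    out.append((char) (hi * 16 + lo));
                    i += 3;
                    continue;
                }
            }
            // Not a valid escape: copy the character unchanged.
            out.append(c);
            i++;
        }
        return out.toString();
    }

    // Decodes the escapes as bytes and interprets the byte sequence as UTF-8.
    static String decodeAsUtf8(String input) {
        try {
            return URLDecoder.decode(input, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        String encoded = "%2Fwebcontent%2Fexample_%C3%B6l.jpg";
        // Two chars (U+00C3, U+00B6) appear where the umlaut should be.
        System.out.println(decodeEachEscapeAsChar(encoded));
        // The two bytes C3 B6 are combined into the single char U+00F6.
        System.out.println(decodeAsUtf8(encoded));
    }
}
```

Once the escapes have been replaced by individual characters, the later call to URLDecoder has nothing left to decode, so the two-character sequence is returned as is.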

The fix could be to use the canonicalize logic only to check the original value for double/mixed encoding, and then to decode the original value via URLDecoder instead of passing the canonicalize output to it.
This way the UrlCodec decode method mirrors the encode method: URLDecoder / URLEncoder do the main job in both.
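
A minimal sketch of this idea, using only the JDK, might look as follows. It is not the attached patch. `UrlDecodeSketch` and `checkNotDoubleEncoded` are hypothetical; the check is a simplified stand-in for the canonicalize logic and assumes that double encoding should be rejected with an exception.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.regex.Pattern;

class UrlDecodeSketch {

    private static final Pattern ESCAPE = Pattern.compile("%[0-9A-Fa-f]{2}");

    // Hypothetical, simplified stand-in for the canonicalize check:
    // reject input that still contains a percent escape after one pass.
    static void checkNotDoubleEncoded(String original) {
        String once = decodeUtf8(original);
        if (ESCAPE.matcher(once).find()) {
            throw new IllegalArgumentException("Double encoding detected: " + original);
        }
    }

    static String decodeUtf8(String value) {
        try {
            return URLDecoder.decode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // The check is used for validation only; its result is discarded and
    // the ORIGINAL value is handed to URLDecoder.
    static String decode(String original) {
        checkNotDoubleEncoded(original);
        return decodeUtf8(original);
    }

    public static void main(String[] args) {
        System.out.println(decode("%2Fwebcontent%2Fexample_%C3%B6l.jpg"));
    }
}
```

With this shape, multi-byte sequences such as `%C3%B6` reach URLDecoder intact, while input such as `%2541` is rejected by the simplified check.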

Attachments

OFBIZ-10275_UrlCodec_decode_via_URLDecoder.patch (0.8 kB, 12/Mar/18 14:47, Martin Becker)

Issue Links

breaks: OFBIZ-11822 Double encoded urls are not being decoded (Closed)

relates to: OFBIZ-12014 Error while decoding url parameters with percent character (Closed)

People

Assignee: Michael Brohl

Reporter: Martin Becker

Votes: 0

Watchers: 3

Dates

Created: 12/Mar/18 14:46

Updated: 14/Sep/20 09:22

Resolved: 12/Mar/18 21:51