[CAMEL-14521] Unicode problem in Bindy component for fixed length data - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 3.1.0
Component/s: camel-bindy
Labels:
- pull-request-available
Environment:

JDK: openjdk-8-jdk Version 8u242-b08-0ubuntu3~18.04 on Ubuntu 18.04 amd64

The ICU4J library is used to process Unicode correctly; see the dependencies in the POM.

Patch Info:

Patch Available
Estimated Complexity:
Moderate
Flags:

Patch

Description

Hi,

AFAIK all versions of Camel are affected by the following bug: Camel miscounts the characters in the fixed-length data format.

Counting the length of a string is tricky with Unicode, especially because Java uses UTF-16 internally, so each code point takes one or two (Java) chars. Bindy appears to select fields with substring and therefore counts chars the way Java does. The length of a string is thus the number of UTF-16 code units (a surrogate pair counts as two), not the number of code points, which is the common denominator (see, for example, the definition of string length in XML Schema). Combining characters make the problem worse: one base character plus zero or more combining characters is perceived by users as a single character.
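To illustrate the three different notions of "length" described above, here is a minimal sketch using only the JDK. The class and method names are hypothetical and are not taken from Bindy or from the pull request. Note that it uses java.text.BreakIterator for user-perceived characters, whereas the pull request relies on ICU4J for that purpose.

```java
import java.text.BreakIterator;

final class UnicodeLengths {

    // Number of UTF-16 code units; this is what String.length() reports.
    static int utf16Units(String s) {
        return s.length();
    }

    // Number of Unicode code points (a surrogate pair counts once).
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    // Number of user-perceived characters as seen by the JDK's BreakIterator.
    // Assumption: the JDK rules are good enough for this simple example;
    // newer emoji sequences may need ICU4J, as the issue points out.
    static int graphemes(String s) {
        BreakIterator it = BreakIterator.getCharacterInstance();
        it.setText(s);
        int count = 0;
        while (it.next() != BreakIterator.DONE) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String clef = "\uD834\uDD1E"; // U+1D11E MUSICAL SYMBOL G CLEF, one code point
        String eAcute = "e\u0301";    // 'e' followed by U+0301 COMBINING ACUTE ACCENT

        System.out.println(utf16Units(clef));   // 2
        System.out.println(codePoints(clef));   // 1
        System.out.println(utf16Units(eAcute)); // 2
        System.out.println(codePoints(eAcute)); // 2
        System.out.println(graphemes(eAcute));  // 1
    }
}
```

The clef shows the surrogate problem (two chars, one code point); the accented "e" shows the combining-character problem (two code points, one perceived character).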

The fixed-length data format depends entirely on counting characters correctly. If they are miscounted, the format is unusable, because the parser cannot recover for the columns to the right.
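The column shift can be reproduced with a small sketch. It assumes a hypothetical record with two fields, three and two characters wide, and compares plain substring with a helper that slices by code points. The helper name `field` is made up for this example; it is not Bindy's implementation, and it handles only surrogate pairs, not combining characters.

```java
final class CodePointSlice {

    // Returns the field that starts at code point index `start`
    // and spans `length` code points (hypothetical helper).
    static String field(String record, int start, int length) {
        int begin = record.offsetByCodePoints(0, start);
        int end = record.offsetByCodePoints(begin, length);
        return record.substring(begin, end);
    }

    public static void main(String[] args) {
        // Field 1 (width 3): 'A', U+1F600 GRINNING FACE, 'B'. Field 2 (width 2): "CD".
        String record = "A\uD83D\uDE00B" + "CD";

        // Counting chars: the emoji occupies two chars, so field 1 loses
        // its 'B' and field 2 is shifted one position to the left.
        System.out.println(record.substring(3, 5)); // BC

        // Counting code points keeps the columns aligned.
        System.out.println(field(record, 3, 2));    // CD
    }
}
```

Every supplementary character to the left of a field shifts that field by one more char, which is why a single emoji early in a record corrupts all following columns.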

See also the mailing list at http://mail-archives.apache.org/mod_mbox/camel-users/202001.mbox/browser

As suggested, I created a pull request, since this may be of interest to the community. It uses the ICU4J library to process Unicode correctly, because the functionality built into the Java API is too old to handle modern emojis (skin colour, hair, sex) correctly. Please take note of the license.

Pull-request: https://github.com/apache/camel/pull/3552

People

Assignee: Claus Ibsen

Reporter: Michael Greulich

Votes: 0

Watchers: 2

Dates

Created: 07/Feb/20 13:59

Updated: 11/Feb/20 09:22

Resolved: 11/Feb/20 09:22