[AMQCPP-232] OpenWire encode and decode UTF8 incorrect - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 2.2.5
Fix Version/s: 2.2.6, 3.0
Component/s: Openwire
Labels:
None
Environment:

Windows XP SP 3, Visual Studio 2008

Regression:

Regression

Description

Hello,

we are using topic messages to send messages from one user to another. Our program subscribes a durable consumer with the selector "UserName='<user>'" and sends a message with the property "UserName" set to "<user>".

Everything works fine when <user> contains only ASCII characters. When <user> contains non-ASCII characters such as äöüßé, the message is not delivered to the consumer.

The problem is that readString and writeString in OpenwireStringSupport.cpp have bugs.

Regards,
Peter
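
For illustration, a minimal sketch of the encoding step the description refers to, assuming the string holds one byte per character with values 0 to 255. The function name encodeLatin1AsUtf8 is hypothetical and is not the ActiveMQ-CPP writeString; the sketch only shows why a character such as ä cannot be written as a single byte in UTF-8.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of ActiveMQ-CPP): treats each byte of
// `value` as a code point in the range 0..255 and emits standard UTF-8.
std::vector<unsigned char> encodeLatin1AsUtf8(const std::string& value) {
    std::vector<unsigned char> out;
    out.reserve(value.size() * 2);  // worst case: every byte becomes two
    for (std::size_t i = 0; i < value.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(value[i]);
        if (c < 0x80) {
            // ASCII range: one byte, unchanged.
            out.push_back(c);
        } else {
            // 0x80..0xFF: two bytes, 110xxxxx 10xxxxxx.
            out.push_back(static_cast<unsigned char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<unsigned char>(0x80 | (c & 0x3F)));
        }
    }
    return out;
}
```

With this sketch, the single byte 0xE4 (ä in ISO-8859-1) becomes the two bytes 0xC3 0xA4, while ASCII characters pass through unchanged.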

Attachments

- OpenwireStringSupportTest.patch, 03/Apr/09 16:12, 4 kB, Martin Schlapfer
- OpenwireStringSupport.patch, 26/Mar/09 20:23, 6 kB, Peter Pfort
- OpenwireStringSupport_fixed.patch, 03/Apr/09 02:56, 7 kB, Martin Schlapfer

Activity

Peter Pfort added a comment - 26/Mar/09 15:29

Hi,

here is the changed src\main\activemq\connector\openwire\utils\OpenwireStringSupport.cpp.

Peter

Timothy A. Bish added a comment - 26/Mar/09 17:38

When attaching a fix, it's much nicer if you attach a patch file, as it's easier to apply to the different branches of the code. Also, I can't do anything until you resubmit your patch and select the option to grant the code to the ASF license.

Timothy A. Bish added a comment - 27/Mar/09 00:29

Patch applied, thanks!

Martin Schlapfer added a comment - 30/Mar/09 23:25

Tim/Peter,

Having just worked in this area on another piece of software, I have a few code review comments on this patch:

(1) In the readString method, the transformation from UTF8 to Unicode should be limited to 1 byte (max value 255), since the UTF8 data is decoded in this method into a byte array. Decoding values above 255 and stuffing them into a byte will only cause debugging problems down the road when receiving UTF8 data with values above 255. Thus, as the code did before the patch was applied (for values above 127), the readString method should throw an IO Exception indicating "Encoding is not supported" if a Unicode value greater than 255 is encountered.

(2) For performance reasons (not much of a factor with 1-byte Unicode, but a greater factor when supporting 2-byte and 4-byte Unicode), bitwise operators should be used to decode / encode between UTF8 and Unicode rather than arithmetic.

(3) The "null character", value 0, should not be skipped. It should be treated as a character and decoded / encoded along with the rest of the characters. The null character is a valid value in UTF8 and Unicode (and in C++ std::string). Treating the null character as a terminator is a C-style string programming artifact.

my two cents, thanks,
Martin.
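
The three review points above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the attached patches: the function name decodeUtf8ToLatin1 is hypothetical, std::runtime_error stands in for the IO Exception mentioned in point (1), and overlong encodings are not rejected.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not the ActiveMQ-CPP readString): decodes UTF-8
// bytes into a std::string holding one byte per character.
std::string decodeUtf8ToLatin1(const std::vector<unsigned char>& in) {
    std::string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        const unsigned char b = in[i];
        if ((b & 0x80) == 0x00) {
            // 1-byte sequence (0xxxxxxx). Value 0 is kept, not skipped.
            out.push_back(static_cast<char>(b));
        } else if ((b & 0xE0) == 0xC0) {
            // 2-byte sequence (110xxxxx 10xxxxxx), decoded with bitwise ops.
            if (i + 1 >= in.size() || (in[i + 1] & 0xC0) != 0x80) {
                throw std::runtime_error("Malformed UTF-8 sequence");
            }
            const unsigned int cp = ((b & 0x1Fu) << 6) | (in[i + 1] & 0x3Fu);
            if (cp > 0xFF) {
                // Does not fit in one byte of the result.
                throw std::runtime_error("Encoding is not supported");
            }
            out.push_back(static_cast<char>(cp));
            ++i;  // consume the continuation byte
        } else {
            // 3- and 4-byte lead bytes decode to values above 255; a stray
            // continuation byte is invalid. Both are rejected here.
            throw std::runtime_error("Encoding is not supported");
        }
    }
    return out;
}
```

Under these assumptions, the bytes 0xC3 0xA4 decode to the single byte 0xE4, an embedded 0x00 survives as a character of the result, and 0xC4 0x80 (U+0100) raises the "Encoding is not supported" error.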

Timothy A. Bish added a comment - 30/Mar/09 23:33

We welcome additional patches.

Martin Schlapfer added a comment - 02/Apr/09 21:36

Reopening to submit patch as mentioned in previous comment.

Martin Schlapfer added a comment - 02/Apr/09 21:37

Updated patch file, per previous comments.

Martin Schlapfer added a comment - 03/Apr/09 02:56

Found a problem with readString while testing encodings that contain both 1-byte AND 2-byte sequences: for a 1-byte sequence, the value was not placed back in the value array. Please use this updated patch instead of the one I attached previously. Thanks.

Timothy A. Bish added a comment - 03/Apr/09 03:01

I won't get to this until tomorrow or Saturday, so feel free to augment the existing CppUnit test for the code and attach that patch as well.

Thanks for the contribution.

Martin Schlapfer added a comment - 03/Apr/09 16:12

Updated the CppUnit test case OpenwireStringSupportTest to cover the patch to OpenwireStringSupport.

Timothy A. Bish added a comment - 04/Apr/09 00:33

I've added the patches to the trunk code today, and I'll work on adding them to the 2.x branch tomorrow. Great work! Thanks for the patches; keep them coming.

I also added a small benchmark test for the OpenwireStringSupport class to the benchmarks suite for performance comparisons.

Timothy A. Bish added a comment - 04/Apr/09 14:49

Resolved; patches added to trunk and 2.x.

People

Assignee: Timothy A. Bish

Reporter: Peter Pfort

Votes: 0

Watchers: 0

Dates

Created: 26/Mar/09 15:26

Updated: 04/Apr/09 14:49

Resolved: 04/Apr/09 14:49
