Tim/Peter,
Having just worked in this area on another piece of software, I have a few code review comments on this patch:
(1) In the readString method, the transformation from UTF8 to Unicode should be limited to 1 byte (max value 255) since the UTF8 data is decoded in this method into a byte array. Decoding values above 255 and stuffing the value into a byte value will only cause debugging problems down then road when receiving UTF8 data with values above 255. Thus, as with the code before patch was applied (with values above 127) , the readString method should throw an IO Exception if a Unicode value greater than 255 is encountered indicating "Encoding is not supported".
(2) For performance reasons (although not much of a factor with 1 byte Unicode, however greater factor in supporting 2 byte and 4 byte Unicode), bitwise operators should be used to decode / encode between UTF8 and Unicode rather than arithmetic.
(3) The "null character", value 0, should not be skipped. It should be treated as a character and decoded / endcoded along with the rest of the characters. The null character is a valid value in UTF8 and Unicode (and c++ std::string's). The null character is a C style string programming artifact.
my two cents, thanks,
Martin.
Hi,
here is the changed src\main\activemq\connector\openwire\utils\OpenwireStringSupport.cpp.
Peter