  1. ActiveMQ C++ Client
  2. AMQCPP-261

Handle Multibyte Strings or Strings encoded in Charsets other than US-ASCII

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.1
    • Fix Version/s: 3.2.0
    • Component/s: CMS Impl, Decaf, Openwire
    • Labels: None

    Description

      The CMS API defines the interface for Strings in the TextMessage using the C++ std::string and const char* primitives and doesn't consider character encodings in its interface or the use of multibyte string representations.

      To let Java, C++, and .NET clients exchange strings, the client encodes the strings in the TextMessage, MapMessage, StreamMessage, and BytesMessage (when writeUTF and readUTF are called), as well as string-typed message properties, in Java's standard Modified UTF-8 format for serialized strings. This design assumes that strings passed in by the application are US-ASCII and that strings received from the broker contain no char values greater than 255; it throws an exception if one is encountered.
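      As an illustration of the Modified UTF-8 format mentioned above, here is a minimal sketch of an encoder for strings whose chars are treated as values in [0..255]. The function name `encodeModifiedUtf8` is hypothetical and is not part of the CMS API; the sketch only covers the one- and two-byte forms needed for that range.

```cpp
#include <string>

// Hypothetical helper (not part of CMS): encodes each char of `input`,
// read as a value in 0..255, into Java-style Modified UTF-8.
std::string encodeModifiedUtf8(const std::string& input) {
    std::string out;
    out.reserve(input.size());
    for (unsigned char c : input) {
        if (c != 0 && c < 0x80) {
            // 0x01..0x7F are written unchanged as a single byte.
            out.push_back(static_cast<char>(c));
        } else {
            // 0x00 becomes the two-byte pair C0 80 (the "modified" part),
            // and 0x80..0xFF become ordinary two-byte UTF-8 sequences.
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    return out;
}
```

      With this sketch, the single byte 0xE9 is written as the two bytes C3 A9, and an embedded null is written as C0 80, so the encoded string never contains a zero byte.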

      The CMS interface needs to be extended to allow for more flexible string handling and offer a mechanism to deal with string encodings other than ASCII.

      An alternative is to change the CMS API's assumption about strings: every string is either ASCII, with no char value above 127 and no embedded nulls, or has already been encoded by the user as Modified UTF-8, so that a Java or .NET client can also read all strings sent in CMS Messages.
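      To make the first half of that rule concrete, the following sketch checks whether a string is plain ASCII with no embedded nulls. `isPlainAscii` is a hypothetical name used only for illustration; a string that fails this check would have to be encoded by the application before sending.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: true when every char is in 0x01..0x7F,
// i.e. the string is ASCII and contains no embedded nulls.
bool isPlainAscii(const std::string& s) {
    return std::all_of(s.begin(), s.end(), [](unsigned char c) {
        return c != 0 && c <= 0x7F;
    });
}
```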

        Activity

          tabish Timothy A. Bish added a comment - I think the resolution to this issue will be to remove all string encoding from the C++ client and enforce the rule that if the client app wishes to send strings with char values larger than 127, then it must first UTF-8 encode the strings itself.
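          A client application following this rule might encode its text along the lines below. This is only a sketch: `toUtf8` is a hypothetical helper, and it assumes the application holds its text as Unicode code points in a `std::u32string`. It produces standard UTF-8, which differs from Modified UTF-8 for U+0000 and for code points above U+FFFF.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: encodes Unicode code points as standard UTF-8.
std::string toUtf8(const std::u32string& text) {
    std::string out;
    for (char32_t cp : text) {
        if (cp < 0x80) {
            out.push_back(static_cast<char>(cp));
        } else if (cp < 0x800) {
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else if (cp < 0x10000) {
            // Surrogate code points are not valid in UTF-8.
            if (cp >= 0xD800 && cp <= 0xDFFF) {
                throw std::invalid_argument("surrogate code point");
            }
            out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else if (cp <= 0x10FFFF) {
            out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else {
            throw std::invalid_argument("code point out of range");
        }
    }
    return out;
}
```

          The resulting `std::string` can then be passed to the message as an opaque byte sequence, since the client no longer re-encodes it.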

          tabish Timothy A. Bish added a comment - Removed all string encoding and decoding from Message processing.

          Added class MarshallingSupport to activemq/util with methods that encode and decode strings to modified UTF-8, for those who want to send string data with char values in the range [0...255] but don't want to use a third-party library.

          Users must now encode and decode wide-character strings as UTF-8 or modified UTF-8 to share string data with a Java or .NET client app.
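          For the receiving side, the sketch below shows one way to decode modified UTF-8 back into chars in the range [0..255]. It is a standalone illustration and does not reproduce the MarshallingSupport methods, whose signatures are not given here; the name `decodeModifiedUtf8ToBytes` and the choice to throw `std::invalid_argument` on anything outside that range are assumptions.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: decodes modified UTF-8 into a string whose chars
// hold values 0..255. Anything that would decode above 255 is rejected.
std::string decodeModifiedUtf8ToBytes(const std::string& encoded) {
    std::string out;
    for (std::size_t i = 0; i < encoded.size(); ++i) {
        unsigned char lead = static_cast<unsigned char>(encoded[i]);
        if (lead < 0x80) {
            // Single-byte form, copied through unchanged.
            out.push_back(static_cast<char>(lead));
            continue;
        }
        // Only lead bytes C0..C3 can produce a value in 0..255.
        if ((lead & 0xFC) != 0xC0 || i + 1 >= encoded.size()) {
            throw std::invalid_argument("value above 255 or truncated input");
        }
        unsigned char next = static_cast<unsigned char>(encoded[++i]);
        if ((next & 0xC0) != 0x80) {
            throw std::invalid_argument("bad continuation byte");
        }
        unsigned value = ((lead & 0x03u) << 6) | (next & 0x3Fu);
        // C0 80 (value 0) is the only overlong form modified UTF-8 allows.
        if (value != 0 && value < 0x80) {
            throw std::invalid_argument("overlong encoding");
        }
        out.push_back(static_cast<char>(value));
    }
    return out;
}
```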

          People

            Assignee: tabish Timothy A. Bish
            Reporter: tabish Timothy A. Bish