  1. ActiveMQ Classic
  2. AMQ-8398

4-byte Unicode message from JMS to STOMP will be corrupted

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 5.16.3
    • Fix Version/s: None
    • Component/s: Broker, STOMP, Transport
    • Labels: None

    Description

      When sending a message from:
      JMS producer to STOMP consumer
      or
      STOMP producer to JMS consumer
      that contains a 4-byte Unicode code point (e.g. https://unicode-table.com/en/1F5A4/), the message is corrupted.
      In the JMS to STOMP case the code point is converted to:
      ef bf bd ef bf bd (the UTF-8 encoding of two U+FFFD replacement characters) when it should be f0 9f 96 a4.
      In the STOMP to JMS case the JMS client throws an exception:

      Exception in thread "main" javax.jms.JMSException: java.io.UTFDataFormatException
              at org.apache.activemq.util.JMSExceptionSupport.create(JMSExceptionSupport.java:72)
              at org.apache.activemq.command.ActiveMQTextMessage.decodeContent(ActiveMQTextMessage.java:104)
              at org.apache.activemq.command.ActiveMQTextMessage.getText(ActiveMQTextMessage.java:84)
              at testkonsument.App.JMS(App.java:86)
              at testkonsument.App.main(App.java:42)
      Caused by: java.io.UTFDataFormatException
              at org.apache.activemq.util.MarshallingSupport.convertUTF8WithBuf(MarshallingSupport.java:389)
              at org.apache.activemq.util.MarshallingSupport.readUTF8(MarshallingSupport.java:358)
              at org.apache.activemq.command.ActiveMQTextMessage.decodeContent(ActiveMQTextMessage.java:101)
              ... 3 more
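
The exception type in the trace, java.io.UTFDataFormatException, is the same one the JDK's own modified UTF-8 reader raises when it meets a 4-byte UTF-8 lead byte. The sketch below reproduces that with DataInputStream.readUTF only. It is an analogy, not ActiveMQ code: whether MarshallingSupport.readUTF8 rejects the input for the same reason is an assumption, and the class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UTFDataFormatException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; uses JDK APIs only, no ActiveMQ classes.
class ModifiedUtf8ReadDemo {

    /** Prefixes standard UTF-8 bytes with the 2-byte length that readUTF expects. */
    static byte[] frameAsStandardUtf8(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        byte[] framed = new byte[utf8.length + 2];
        framed[0] = (byte) (utf8.length >>> 8);
        framed[1] = (byte) utf8.length;
        System.arraycopy(utf8, 0, framed, 2, utf8.length);
        return framed;
    }

    /** Returns true if the JDK's modified UTF-8 reader refuses the framed bytes. */
    static boolean rejectedByModifiedUtf8Reader(byte[] framed) {
        try {
            new DataInputStream(new ByteArrayInputStream(framed)).readUTF();
            return false;
        } catch (UTFDataFormatException e) {
            return true; // same exception type as in the stack trace above
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String blackHeart = "\uD83D\uDDA4"; // U+1F5A4, f0 9f 96 a4 in standard UTF-8
        String umbrella = "\u2614";         // U+2614, e2 98 94 in standard UTF-8
        System.out.println(rejectedByModifiedUtf8Reader(frameAsStandardUtf8(blackHeart))); // true
        System.out.println(rejectedByModifiedUtf8Reader(frameAsStandardUtf8(umbrella)));   // false
    }
}
```

The 3-byte code point is accepted and the 4-byte one is rejected, which matches the behaviour reported for the STOMP to JMS direction.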
      

      Using 4-byte Unicode code points
      from STOMP to STOMP
      or
      from JMS to JMS
      is not a problem; both work and do not corrupt the code point.

      Note that 2-byte (e.g. https://unicode-table.com/en/00F6/) and 3-byte (e.g. https://unicode-table.com/en/2614/) Unicode code points do NOT get corrupted, even if the same message also includes a 4-byte Unicode code point.
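
One possible explanation for this pattern, offered here as a hypothesis rather than a confirmed root cause, is a mix of standard UTF-8 and a Java-style "modified UTF-8" (the format written by DataOutputStream.writeUTF). The two encodings agree for 2-byte and 3-byte code points and differ only for code points outside the Basic Multilingual Plane. The sketch below compares them using JDK APIs only; the class and method names are hypothetical and no ActiveMQ code is involved.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class comparing standard and modified UTF-8.
class Utf8EncodingComparison {

    static byte[] standardUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    /** Modified UTF-8 payload as written by writeUTF, without the 2-byte length prefix. */
    static byte[] modifiedUtf8(String text) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            new DataOutputStream(buffer).writeUTF(text);
            byte[] framed = buffer.toByteArray();
            return Arrays.copyOfRange(framed, 2, framed.length);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Decodes bytes as standard UTF-8 (replacing malformed input) and encodes them again. */
    static byte[] reencodeAsStandardUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8).getBytes(StandardCharsets.UTF_8);
    }

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+00F6 (2 bytes), U+2614 (3 bytes), U+1F5A4 (4 bytes in standard UTF-8)
        for (String s : new String[] {"\u00F6", "\u2614", "\uD83D\uDDA4"}) {
            byte[] modified = modifiedUtf8(s);
            System.out.println("standard:   " + hex(standardUtf8(s)));
            System.out.println("modified:   " + hex(modified));
            System.out.println("re-encoded: " + hex(reencodeAsStandardUtf8(modified)));
        }
    }
}
```

For U+1F5A4 the standard encoding is f0 9f 96 a4, the modified encoding is ed a0 bd ed b6 a4 (one 3-byte sequence per UTF-16 surrogate), and decoding those six bytes as standard UTF-8 before re-encoding gives ef bf bd ef bf bd, the same bytes observed in the JMS to STOMP case. For U+00F6 and U+2614 all three lines are identical.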

          People

            jbonofre Jean-Baptiste Onofré
            simmel Simon Lundstrom
