On the 7.0 broker, asynchronous publishing of transient messages into a topic exchange that routes messages into queues bound using non-overlapping selectors is 2-3 times slower than on the 0.32 broker. The degradation is observed with AMQP 0-9, though I suspect that the AMQP 0-10 protocol could be affected as well.
I ran tests with 10 concurrent producers publishing messages on separate connections, using the same routing key, into 10 different queues (subscriber queues) bound to the exchange using non-overlapping selectors.
My testing showed that, for this particular use case, the 7.0 broker performed 2-3 times worse than the 0.32 broker.
The following factors contributed to degradation of performance:
• Copying data from direct memory into heap memory whilst decoding message headers. Due to this factor, decoding of message headers is around twice as slow. It seems to contribute around 70% of the total performance degradation.
• The message routing algorithm is slower due to the need to support a new feature: routing messages into bound exchanges (in addition to queues) using a replacement routing key.
• AMQ short strings caching contributes 5-10% to total performance degradation. The caching was added to manage heap space more efficiently.
The numbers provided here could be inaccurate due to instrumentation overhead whilst profiling the issue.
Potentially, caching can be turned off but that will not improve performance much.
On the other hand, adding extra caching of strings converted to amqp-short-strings would improve the performance a bit. Whilst evaluating selectors, the fields used in selector expressions are represented as Java strings, but they are converted into amqp-short-strings every time a message header value is looked up. If 10 queues are bound to the exchange using the same binding key, the selector expression is evaluated 10 times for each incoming message. Thus, all selector field names are converted into amqp-short-strings 10 times as well. It seems that adding caching here can improve the performance.
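A minimal sketch of the kind of caching suggested above, not the broker's actual code. `ShortString` is a hypothetical stand-in for the broker's amqp-short-string type, and `SelectorFieldNameCache` and `toShortString` are names invented for illustration. The idea is only that each selector field name is converted once and the result is reused on every later evaluation.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the broker's amqp-short-string type (not the real class).
final class ShortString {
    private final byte[] bytes;

    ShortString(String value) {
        // The conversion cost we want to pay only once per field name.
        byte[] encoded = value.getBytes(StandardCharsets.UTF_8);
        if (encoded.length > 255) {
            throw new IllegalArgumentException("short string exceeds 255 bytes");
        }
        this.bytes = encoded;
    }

    int length() {
        return bytes.length;
    }
}

// Hypothetical cache mapping Java field names to their short-string form.
final class SelectorFieldNameCache {
    private static final Map<String, ShortString> CACHE = new ConcurrentHashMap<>();

    // Converts on first use; later calls with an equal name return the same instance.
    static ShortString toShortString(String fieldName) {
        return CACHE.computeIfAbsent(fieldName, ShortString::new);
    }

    static int size() {
        return CACHE.size();
    }

    public static void main(String[] args) {
        ShortString first = toShortString("region");
        // Simulates the same selector field being looked up for 10 bound queues.
        for (int i = 0; i < 10; i++) {
            if (toShortString("region") != first) {
                throw new AssertionError("expected the cached instance");
            }
        }
        System.out.println("cached entries: " + size());
    }
}
```

The map in this sketch is unbounded. That is probably acceptable if the keys come only from selector expressions of existing bindings, but a real implementation would likely need a size limit or eviction.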