[QPID-8546] [Broker-J] BDB HA replication of atomic BDB transactions for message acknowldgements impacts consumer performance - ASF JIRA

XML

Word

Printable

JSON

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: qpid-java-broker-8.0.5
Fix Version/s: qpid-java-broker-8.0.6
Component/s: Broker-J
Labels:
None

Description

With current Qpid Broker versions (up to 8.0.4 including) when BDB HA cluster is distributed across different DCs, a consumer throughput falls significantly behind the producer . As result, the total system throughput (determined by the consumer) can be very low.

Such behavior is caused by implementation specifics of BDB message store:

BDB HA transactions are synchronous for the majority of the nodes. (The messaging transaction is committed when majority of the nodes in the cluster acknowledge the transaction on their side)
The deletion of dequeued messages from the store is synchronous and impacted by the DC latency.
2 separate underlying store transactions are used to delete each individual message data. Thus, if message consumption happens in transactional batches of N, the messages are dequeued from the queue in the transacted batch of N messages. However, after message dequeuing, the unreferenced message data is deleted using its own store transactions (one for content and another for metadata). As result for a batch of N, the store runs 1 + N*2 transactions. The N*2 transactions are synchronous and executed one after another. If a latency between data centers is high, it adds a corresponding delay to each store transaction. As result, the message removal takes time. Only after deletion of N messages, the broker sends commit confirmation to the client. The message enqueing works differently.

Attachments

Activity

People

Assignee:: Unassigned

Reporter:: Dedeepya

Votes:: 0 Vote for this issue

Watchers:: 2 Start watching this issue

Dates

Created:: 01/Jul/21 04:57

Updated:: 29/Aug/21 19:04

Resolved:: 22/Aug/21 18:26