Details
Type: New Feature
Status: Closed
Priority: Major
Resolution: Duplicate
Description
The transaction manager should guarantee that, once a pre-commit/pre-abort request is acknowledged, the corresponding commit/abort request will be delivered to all partitions involved in the transaction.
In particular, we handle the following failover scenarios:
1) The transaction manager or its followers fail before the txRequest is replicated to the local log and the followers.
Solution: The transaction manager, if it is still alive, responds to the request with an error status. The producer keeps retrying the commit.
2) The txPartition’s leader is not available.
Solution: Put the txRequest on unSentTxRequestQueue. When metadataCache is updated, check unSentTxRequestQueue and re-send each txRequest whose leader is now available.
3) The txPartition’s leader fails while the txRequest is queued in the channel manager.
Solution: Retrieve all txRequests queued for transmission to this broker and put them on unSentTxRequestQueue.
4) The transaction manager does not receive a success response from the txPartition’s leader within the timeout period.
Solution: The transaction manager expires the txRequest and re-sends it.
5) Transaction manager fails.
Solution: The new transaction manager reads transactionHW from ZooKeeper and sends txRequests starting from the transactionHW.
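
The retry bookkeeping for scenarios 2 to 4 could look roughly like the sketch below. It is only an illustration under stated assumptions, not the proposed implementation: the class TxRequestRetrySketch, the TxRequest fields, the per-broker map standing in for the channel manager, and all method names are hypothetical. Only unSentTxRequestQueue is a name taken from the description above. Leader lookup is modeled as a plain map from txPartition to broker id.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

/** Hypothetical sketch of the retry bookkeeping for scenarios 2-4. */
class TxRequestRetrySketch {
    /** A commit/abort request for one txPartition (illustrative type). */
    static final class TxRequest {
        final long txId;
        final String txPartition;
        TxRequest(long txId, String txPartition) {
            this.txId = txId;
            this.txPartition = txPartition;
        }
    }

    // Scenario 2: requests whose txPartition leader is currently unknown.
    private final Queue<TxRequest> unSentTxRequestQueue = new ArrayDeque<>();
    // Stand-in for the channel manager: requests queued per destination broker.
    private final Map<Integer, List<TxRequest>> queuedByBroker = new HashMap<>();
    // Requests awaiting a success response, with the time they were sent.
    private final Map<TxRequest, Long> sentAtMs = new HashMap<>();

    /** Sends the request if its leader is known; otherwise parks it. */
    void send(TxRequest req, Map<String, Integer> leaders, long nowMs) {
        Integer leader = leaders.get(req.txPartition);
        if (leader == null) {
            unSentTxRequestQueue.add(req);
            return;
        }
        queuedByBroker.computeIfAbsent(leader, b -> new ArrayList<>()).add(req);
        sentAtMs.put(req, nowMs);
    }

    /** Scenario 2: after a metadata update, retry every parked request. */
    void onMetadataUpdate(Map<String, Integer> leaders, long nowMs) {
        List<TxRequest> parked = new ArrayList<>(unSentTxRequestQueue);
        unSentTxRequestQueue.clear();
        for (TxRequest req : parked) {
            send(req, leaders, nowMs); // re-parks itself if still leaderless
        }
    }

    /** Scenario 3: a broker failed; move its queued requests back to unsent. */
    void onBrokerFailure(int brokerId) {
        List<TxRequest> stranded = queuedByBroker.remove(brokerId);
        if (stranded == null) {
            return;
        }
        for (TxRequest req : stranded) {
            sentAtMs.remove(req);
            unSentTxRequestQueue.add(req);
        }
    }

    /** A success response arrived; stop tracking the request. */
    void onSuccess(TxRequest req) {
        forget(req);
    }

    /** Scenario 4: expire requests older than timeoutMs and re-send them. */
    int expireAndResend(Map<String, Integer> leaders, long timeoutMs, long nowMs) {
        List<TxRequest> expired = new ArrayList<>();
        for (Map.Entry<TxRequest, Long> e : sentAtMs.entrySet()) {
            if (nowMs - e.getValue() >= timeoutMs) {
                expired.add(e.getKey());
            }
        }
        for (TxRequest req : expired) {
            forget(req);
            send(req, leaders, nowMs);
        }
        return expired.size();
    }

    private void forget(TxRequest req) {
        sentAtMs.remove(req);
        for (List<TxRequest> queued : queuedByBroker.values()) {
            queued.remove(req);
        }
    }

    int unsentCount() { return unSentTxRequestQueue.size(); }
    int inFlightCount() { return sentAtMs.size(); }
}
```

In this sketch a request is always in exactly one place: parked on unSentTxRequestQueue, or in flight to a broker. That makes it easy to see that no acknowledged commit/abort is dropped on a leader change or timeout. Scenarios 1 and 5 are not modeled here, since they depend on log replication and on the transactionHW stored in ZooKeeper.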