Details
- Improvement
- Status: Patch Available
- Major
- Resolution: Unresolved
- 3.7.0
- None
Description
When a transaction coordinator for a transactional client shuts down for restart or due to failure, the NetworkClient notices the broker disconnection and will automatically refresh cluster metadata to get the latest partition assignments.
The TransactionManager does not notice the change until the next transactional request. If the broker is still offline, that request blocks while the client attempts to reconnect to the old coordinator, which can take up to request.timeout.ms (default 30 seconds). Coordinator lookup is only performed after a transactional request times out and fails. The lookup is triggered in the error handling of either the Sender or the TransactionManager.
To support faster recovery and faster reaction to transaction coordinator reassignments, the TransactionManager should proactively look up the transaction coordinator whenever the client is disconnected from the current transaction coordinator.
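As a rough illustration of the proposed behavior, the sketch below models a disconnect callback that clears the cached coordinator and queues a lookup immediately instead of waiting for a transactional request to time out. It is not the Kafka client's actual code: CoordinatorTracker, onDisconnect, onCoordinatorFound, and the string-based request queue are hypothetical names invented for this example, and the real change would live inside the TransactionManager and Sender.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/**
 * Hypothetical stand-in for the coordinator-tracking part of a
 * transaction manager. None of these names come from the Kafka codebase.
 */
class CoordinatorTracker {
    /** Sentinel meaning "no coordinator is currently known". */
    static final int NO_NODE = -1;

    private int coordinatorId;
    private boolean lookupPending = false;
    // Stand-in for the queue of requests the sender thread would drain.
    private final List<String> enqueuedRequests = new ArrayList<>();

    CoordinatorTracker(int initialCoordinatorId) {
        this.coordinatorId = initialCoordinatorId;
    }

    /** Assumed hook: the network layer reports a lost connection to nodeId. */
    void onDisconnect(int nodeId) {
        // Ignore disconnects from other brokers, and do not queue a second
        // lookup while one is already in flight.
        if (coordinatorId == NO_NODE || nodeId != coordinatorId || lookupPending) {
            return;
        }
        coordinatorId = NO_NODE;                  // forget the stale coordinator
        lookupPending = true;
        enqueuedRequests.add("FindCoordinator");  // proactive lookup
    }

    /** Assumed hook: a lookup response names the new coordinator. */
    void onCoordinatorFound(int nodeId) {
        coordinatorId = nodeId;
        lookupPending = false;
    }

    int coordinatorId() { return coordinatorId; }

    boolean lookupPending() { return lookupPending; }

    List<String> enqueuedRequests() {
        return Collections.unmodifiableList(enqueuedRequests);
    }
}
```

In this model, a disconnect from any broker other than the current coordinator is ignored, and repeated disconnect notifications for the coordinator queue only one lookup. The next transactional request can then be routed to the newly discovered coordinator without first waiting out request.timeout.ms against the old one.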