Details
- Bug
- Status: Resolved
- Blocker
- Resolution: Fixed
- 3.6.0
- None
Description
When asked to perform a round of zombie fencing, the distributed herder will reject the request if a rebalance is pending, which can happen if (among other things) a config for a new connector or a new set of task configs has been recently read from the config topic.
Normally this can be alleviated with a simple task restart, which isn't great but isn't terrible.
However, when running MirrorMaker 2 in dedicated mode, there is no API to restart failed tasks, and it can be more common to see this kind of failure on a fresh cluster because three connector configurations are written in rapid succession to the config topic.
To provide a better experience for users of both vanilla Kafka Connect and dedicated MirrorMaker 2 clusters, we can retry zombie fencing attempts that fail due to a pending rebalance, likely with the same exponential backoff introduced with KAFKA-14732.
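As a rough illustration of the proposed behavior, here is a minimal sketch of retrying a fencing attempt with capped exponential backoff. It is not the Kafka Connect implementation: `FencingRetrySketch`, `RebalancePendingException`, `backoffMs`, and `retryOnPendingRebalance` are hypothetical names, and the retry limit and delay values are assumptions chosen only for the example.

```java
import java.util.function.LongConsumer;
import java.util.function.Supplier;

class FencingRetrySketch {

    /** Hypothetical stand-in for the error raised when a rebalance is pending. */
    static class RebalancePendingException extends RuntimeException {
        RebalancePendingException(String message) {
            super(message);
        }
    }

    /** Delay before retry number `attempt` (0-based): initialMs doubled per attempt, capped at maxMs. */
    static long backoffMs(int attempt, long initialMs, long maxMs) {
        long delay = initialMs;
        for (int i = 0; i < attempt && delay < maxMs; i++) {
            delay *= 2;
        }
        return Math.min(delay, maxMs);
    }

    /**
     * Runs `fencing`, retrying only when it fails because a rebalance is pending.
     * Any other exception propagates immediately. After `maxRetries` retries the
     * last RebalancePendingException is rethrown. `sleeper` is injected so the
     * wait can be replaced in tests.
     */
    static <T> T retryOnPendingRebalance(Supplier<T> fencing, int maxRetries,
                                         long initialMs, long maxMs, LongConsumer sleeper) {
        for (int attempt = 0; ; attempt++) {
            try {
                return fencing.get();
            } catch (RebalancePendingException e) {
                if (attempt >= maxRetries) {
                    throw e;
                }
                sleeper.accept(backoffMs(attempt, initialMs, maxMs));
            }
        }
    }

    public static void main(String[] args) {
        // Real sleeper: blocks the calling thread between attempts.
        LongConsumer sleeper = ms -> {
            try {
                Thread.sleep(ms);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new RuntimeException(e);
            }
        };

        // Simulated fencing request: rejected twice, then accepted.
        int[] calls = {0};
        String outcome = retryOnPendingRebalance(() -> {
            if (calls[0]++ < 2) {
                throw new RebalancePendingException("rebalance pending");
            }
            return "fenced";
        }, 5, 250L, 1000L, sleeper);

        System.out.println(outcome + " after " + calls[0] + " attempts");
    }
}
```

Only the pending-rebalance failure is retried here, so unrelated fencing errors still surface right away. The cap on the delay keeps a long run of rebalances from pushing the wait out indefinitely.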
Issue Links
- causes: KAFKA-14718 Flaky DedicatedMirrorIntegrationTest test suite (Resolved)