Details
-
Bug
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
2.7.0
-
None
Description
We have an active-passive setup where the 3 connectors from mirror maker 2 (heartbeat, checkpoint and source) are running on a dedicated Kafka connect cluster on the target cluster.
Offset syncing is enabled as specified by KIP-545. But when activated, it seems the offsets from the source cluster are initially copied to the target cluster without translation. This causes a negative lag for all synced consumer groups. Only when we reset the offsets for each topic/partition on the target cluster and produce a record on the topic/partition in the source, the sync starts working correctly.
I would expect that the consumer groups are synced but that the current offsets of the source cluster are not copied to the target cluster.
This is the configuration we are currently using:
Heartbeat connector
{ "name": "mm2-mirror-heartbeat", "config": { "name": "mm2-mirror-heartbeat", "connector.class": "org.apache.kafka.connect.mirror.MirrorHeartbeatConnector", "source.cluster.alias": "eventador", "target.cluster.alias": "msk", "source.cluster.bootstrap.servers": "<SOURCE_CLUSTER>", "target.cluster.bootstrap.servers": "<TARGET_CLUSTER>", "topics": ".*", "groups": ".*", "tasks.max": "1", "replication.policy.class": "CustomReplicationPolicy", "sync.group.offsets.enabled": "true", "sync.group.offsets.interval.seconds": "5", "emit.checkpoints.enabled": "true", "emit.checkpoints.interval.seconds": "30", "emit.heartbeats.interval.seconds": "30", "key.converter": " org.apache.kafka.connect.converters.ByteArrayConverter", "value.converter": "org.apache.kafka.connect.converters.ByteArrayConverter" } }
Checkpoint connector:
{ "name": "mm2-mirror-checkpoint", "config": { "name": "mm2-mirror-checkpoint", "connector.class": "org.apache.kafka.connect.mirror.MirrorCheckpointConnector", "source.cluster.alias": "eventador", "target.cluster.alias": "msk", "source.cluster.bootstrap.servers": "<SOURCE_CLUSTER>", "target.cluster.bootstrap.servers": "<TARGET_CLUSTER>", "topics": ".*", "groups": ".*", "tasks.max": "40", "replication.policy.class": "CustomReplicationPolicy", "sync.group.offsets.enabled": "true", "sync.group.offsets.interval.seconds": "5", "emit.checkpoints.enabled": "true", "emit.checkpoints.interval.seconds": "30", "emit.heartbeats.interval.seconds": "30", "key.converter": " org.apache.kafka.connect.converters.ByteArrayConverter", "value.converter": "org.apache.kafka.connect.converters.ByteArrayConverter" } }
Source connector:
{ "name": "mm2-mirror-source", "config": { "name": "mm2-mirror-source", "connector.class": "org.apache.kafka.connect.mirror.MirrorSourceConnector", "source.cluster.alias": "eventador", "target.cluster.alias": "msk", "source.cluster.bootstrap.servers": "<SOURCE_CLUSTER>", "target.cluster.bootstrap.servers": "<TARGET_CLUSTER>", "topics": ".*", "groups": ".*", "tasks.max": "40", "replication.policy.class": "CustomReplicationPolicy", "sync.group.offsets.enabled": "true", "sync.group.offsets.interval.seconds": "5", "emit.checkpoints.enabled": "true", "emit.checkpoints.interval.seconds": "30", "emit.heartbeats.interval.seconds": "30", "key.converter": " org.apache.kafka.connect.converters.ByteArrayConverter", "value.converter": "org.apache.kafka.connect.converters.ByteArrayConverter" } }
Attachments
Issue Links
- causes
-
KAFKA-14797 MM2 does not emit offset syncs when conservative translation logic exceeds positive max.offset.lag
- Resolved
- is related to
-
KAFKA-13932 Replication data loss in some cases
- Resolved
-
KAFKA-15144 MM2 Checkpoint downstreamOffset stuck to 1
- Resolved
-
KAFKA-13562 Mirror Maker 2 Negative Offsets
- Resolved
-
KAFKA-14666 MM2 should translate consumer group offsets behind replication flow
- Resolved
- relates to
-
KAFKA-16291 Mirrormaker2 wrong checkpoints
- Open
-
KAFKA-15202 MM2 OffsetSyncStore clears too many syncs when sync spacing is variable
- Resolved
-
KAFKA-15905 Restarts of MirrorCheckpointTask should not permanently interrupt offset translation
- Resolved
-
KAFKA-15906 Emit offset syncs more often than offset.lag.max for low-throughput/finite partitions
- Resolved
-
KAFKA-16303 Add upgrade notes about recent MM2 offset translation changes
- Resolved
- Testing discovered
-
KAFKA-14663 High throughput topics can starve low-throughput MM2 offset syncs
- Resolved
-
KAFKA-14727 Connect EOS mode should periodically call task commit
- Resolved
- links to