[KAFKA-12468] Initial offsets are copied from source to target cluster - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 2.7.0
Fix Version/s: 3.5.0, 3.4.1, 3.3.3
Component/s: mirrormaker
Labels: None

Description

We have an active-passive setup in which the three MirrorMaker 2 connectors (heartbeat, checkpoint, and source) run on a dedicated Kafka Connect cluster on the target side.

Offset syncing is enabled as specified by KIP-545. When it is activated, however, the consumer group offsets from the source cluster appear to be copied to the target cluster initially without translation, which causes negative lag for all synced consumer groups. The sync only starts working correctly after we reset the offsets for each topic-partition on the target cluster and produce a record to that topic-partition on the source.

I would expect the consumer groups to be synced without the current source-cluster offsets being copied as-is to the target cluster.
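
To make the negative lag concrete, here is a minimal sketch of the arithmetic. It is a simplified model, not MirrorMaker 2's actual translation code. The `OffsetSync` type, the `translate` helper, and all numbers are hypothetical, and the model assumes records are replicated one-to-one after the known sync point.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OffsetSync:
    """Hypothetical pairing of a source offset with the matching target offset."""
    upstream_offset: int
    downstream_offset: int


def consumer_lag(log_end_offset: int, committed_offset: int) -> int:
    """Lag is the distance from the committed offset to the end of the log."""
    return log_end_offset - committed_offset


def translate(sync: OffsetSync, upstream_committed: int) -> int:
    """Map a source-cluster offset onto the target cluster via a known sync point.

    Simplified model: assumes every record after the sync point was
    replicated exactly once, so offsets advance in lockstep.
    """
    if upstream_committed < sync.upstream_offset:
        raise ValueError("committed offset precedes the known sync point")
    return sync.downstream_offset + (upstream_committed - sync.upstream_offset)


# Illustrative numbers: the source partition starts at offset 1000 because
# older records were deleted by retention; the mirrored partition starts at 0.
source_log_end = 1500
source_committed = 1400
target_log_end = 500
sync = OffsetSync(upstream_offset=1000, downstream_offset=0)

# Copying the source offset unchanged puts the group past the end of the target log.
copied_lag = consumer_lag(target_log_end, source_committed)

# Translating first preserves the lag the group had on the source.
translated_lag = consumer_lag(target_log_end, translate(sync, source_committed))

print(copied_lag)      # -900
print(translated_lag)  # 100
```

In this model the translated lag equals the lag on the source (1500 - 1400 = 100), while the untranslated copy yields the negative lag described in this report.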

This is the configuration we are currently using:

Heartbeat connector:

{
  "name": "mm2-mirror-heartbeat",
  "config": {
    "name": "mm2-mirror-heartbeat",
    "connector.class": "org.apache.kafka.connect.mirror.MirrorHeartbeatConnector",
    "source.cluster.alias": "eventador",
    "target.cluster.alias": "msk",
    "source.cluster.bootstrap.servers": "<SOURCE_CLUSTER>",
    "target.cluster.bootstrap.servers": "<TARGET_CLUSTER>",
    "topics": ".*",
    "groups": ".*",
    "tasks.max": "1",
    "replication.policy.class": "CustomReplicationPolicy",
    "sync.group.offsets.enabled": "true",
    "sync.group.offsets.interval.seconds": "5",
    "emit.checkpoints.enabled": "true",
    "emit.checkpoints.interval.seconds": "30",
    "emit.heartbeats.interval.seconds": "30",
    "key.converter": "org.apache.kafka.connect.converters.ByteArrayConverter",
    "value.converter": "org.apache.kafka.connect.converters.ByteArrayConverter"
  }
}

Checkpoint connector:

{
  "name": "mm2-mirror-checkpoint",
  "config": {
    "name": "mm2-mirror-checkpoint",
    "connector.class": "org.apache.kafka.connect.mirror.MirrorCheckpointConnector",
    "source.cluster.alias": "eventador",
    "target.cluster.alias": "msk",
    "source.cluster.bootstrap.servers": "<SOURCE_CLUSTER>",
    "target.cluster.bootstrap.servers": "<TARGET_CLUSTER>",
    "topics": ".*",
    "groups": ".*",
    "tasks.max": "40",
    "replication.policy.class": "CustomReplicationPolicy",
    "sync.group.offsets.enabled": "true",
    "sync.group.offsets.interval.seconds": "5",
    "emit.checkpoints.enabled": "true",
    "emit.checkpoints.interval.seconds": "30",
    "emit.heartbeats.interval.seconds": "30",
    "key.converter": "org.apache.kafka.connect.converters.ByteArrayConverter",
    "value.converter": "org.apache.kafka.connect.converters.ByteArrayConverter"
  }
}

Source connector:

{
  "name": "mm2-mirror-source",
  "config": {
    "name": "mm2-mirror-source",
    "connector.class": "org.apache.kafka.connect.mirror.MirrorSourceConnector",
    "source.cluster.alias": "eventador",
    "target.cluster.alias": "msk",
    "source.cluster.bootstrap.servers": "<SOURCE_CLUSTER>",
    "target.cluster.bootstrap.servers": "<TARGET_CLUSTER>",
    "topics": ".*",
    "groups": ".*",
    "tasks.max": "40",
    "replication.policy.class": "CustomReplicationPolicy",
    "sync.group.offsets.enabled": "true",
    "sync.group.offsets.interval.seconds": "5",
    "emit.checkpoints.enabled": "true",
    "emit.checkpoints.interval.seconds": "30",
    "emit.heartbeats.interval.seconds": "30",
    "key.converter": "org.apache.kafka.connect.converters.ByteArrayConverter",
    "value.converter": "org.apache.kafka.connect.converters.ByteArrayConverter"
  }
}
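
The three JSON documents above have the `name`/`config` shape that the Kafka Connect REST API accepts on `POST /connectors`. For anyone trying to reproduce the setup, the following sketch builds such a request with the Python standard library. The worker address `http://localhost:8083` and the `build_create_request` helper are assumptions for illustration, and the config is abbreviated to a few of the keys shown above.

```python
import json
import urllib.request

# Hypothetical Connect worker address; replace with a real worker URL.
CONNECT_URL = "http://localhost:8083"


def build_create_request(connector: dict, base_url: str = CONNECT_URL) -> urllib.request.Request:
    """Build (but do not send) a POST /connectors request for one connector."""
    body = json.dumps(connector).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/connectors",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Abbreviated copy of the heartbeat connector definition from this report.
heartbeat = {
    "name": "mm2-mirror-heartbeat",
    "config": {
        "connector.class": "org.apache.kafka.connect.mirror.MirrorHeartbeatConnector",
        "source.cluster.alias": "eventador",
        "target.cluster.alias": "msk",
        "source.cluster.bootstrap.servers": "<SOURCE_CLUSTER>",
        "target.cluster.bootstrap.servers": "<TARGET_CLUSTER>",
    },
}

request = build_create_request(heartbeat)
# To submit it against a running worker: urllib.request.urlopen(request)
```

The request is only constructed here, so the sketch runs without a Connect cluster; the placeholders `<SOURCE_CLUSTER>` and `<TARGET_CLUSTER>` must be filled in before sending.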

Issue Links

causes

KAFKA-14797 MM2 does not emit offset syncs when conservative translation logic exceeds positive max.offset.lag (Resolved)

is related to

KAFKA-13932 Replication data loss in some cases (Resolved)
KAFKA-15144 MM2 Checkpoint downstreamOffset stuck to 1 (Resolved)
KAFKA-13562 Mirror Maker 2 Negative Offsets (Resolved)
KAFKA-14666 MM2 should translate consumer group offsets behind replication flow (Resolved)

relates to

KAFKA-16291 Mirrormaker2 wrong checkpoints (Open)
KAFKA-15202 MM2 OffsetSyncStore clears too many syncs when sync spacing is variable (Resolved)
KAFKA-15905 Restarts of MirrorCheckpointTask should not permanently interrupt offset translation (Resolved)
KAFKA-15906 Emit offset syncs more often than offset.lag.max for low-throughput/finite partitions (Resolved)
KAFKA-16303 Add upgrade notes about recent MM2 offset translation changes (Resolved)

Testing discovered

KAFKA-14663 High throughput topics can starve low-throughput MM2 offset syncs (Resolved)
KAFKA-14727 Connect EOS mode should periodically call task commit (Resolved)

links to

GitHub Pull Request #13178
