KAFKA-10253

Kafka Connect gets into an infinite rebalance loop

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Not A Problem
    • Affects Version/s: 2.5.0
    • Fix Version/s: None
    • Component/s: connect
    • Labels: None

    Description

      Hello everyone!

       

      We are running a Kafka Connect cluster (3 workers), and it very often gets into an infinite rebalance loop. The log of one worker looks like this:

       

      2020-07-09 08:51:25,731 INFO [Worker clientId=connect-1, groupId= kafka-connect] Rebalance started (org.apache.kafka.connect.runtime.distributed.WorkerCoordinator) [DistributedHerder-connect-1-1]
      2020-07-09 08:51:25,731 INFO [Worker clientId=connect-1, groupId= kafka-connect] (Re-)joining group (org.apache.kafka.clients.consumer.internals.AbstractCoordinator) [DistributedHerder-connect-1-1]
      2020-07-09 08:51:25,733 INFO [Worker clientId=connect-1, groupId= kafka-connect] Was selected to perform assignments, but do not have latest config found in sync request. Returning an empty configuration to trigger re-sync. (org.apache.kafka.connect.runtime.distributed.IncrementalCooperativeAssignor) [DistributedHerder-connect-1-1]
      2020-07-09 08:51:25,735 INFO [Worker clientId=connect-1, groupId= kafka-connect] Successfully joined group with generation 305655831 (org.apache.kafka.clients.consumer.internals.AbstractCoordinator) [DistributedHerder-connect-1-1]
      2020-07-09 08:51:25,735 INFO [Worker clientId=connect-1, groupId= kafka-connect] Joined group at generation 305655831 with protocol version 2 and got assignment: Assignment{error=1, leader='connect-1-0008abc5-a152-42fe-a697-a4a4641f72bb', leaderUrl='http://10.20.30.221:8083/', offset=12, connectorIds=[], taskIds=[], revokedConnectorIds=[], revokedTaskIds=[], delay=0} with rebalance delay: 0 (org.apache.kafka.connect.runtime.distributed.DistributedHerder) [DistributedHerder-connect-1-1]
      2020-07-09 08:51:25,735 INFO [Worker clientId=connect-1, groupId= kafka-connect] Rebalance started (org.apache.kafka.connect.runtime.distributed.WorkerCoordinator) [DistributedHerder-connect-1-1]
      2020-07-09 08:51:25,735 INFO [Worker clientId=connect-1, groupId= kafka-connect] (Re-)joining group (org.apache.kafka.clients.consumer.internals.AbstractCoordinator) [DistributedHerder-connect-1-1]
      2020-07-09 08:51:25,736 INFO [Worker clientId=connect-1, groupId= kafka-connect] Was selected to perform assignments, but do not have latest config found in sync request. Returning an empty configuration to trigger re-sync. (org.apache.kafka.connect.runtime.distributed.IncrementalCooperativeAssignor) [DistributedHerder-connect-1-1]
      2020-07-09 08:51:25,739 INFO [Worker clientId=connect-1, groupId= kafka-connect] Successfully joined group with generation 305655832 (org.apache.kafka.clients.consumer.internals.AbstractCoordinator) [DistributedHerder-connect-1-1]
      2020-07-09 08:51:25,739 INFO [Worker clientId=connect-1, groupId= kafka-connect] Joined group at generation 305655832 with protocol version 2 and got assignment: Assignment{error=1, leader='connect-1-0008abc5-a152-42fe-a697-a4a4641f72bb', leaderUrl='http://10.20.30.221:8083/', offset=12, connectorIds=[], taskIds=[], revokedConnectorIds=[], revokedTaskIds=[], delay=0} with rebalance delay: 0 (org.apache.kafka.connect.runtime.distributed.DistributedHerder) [DistributedHerder-connect-1-1]
      2020-07-09 08:51:25,739 INFO [Worker clientId=connect-1, groupId= kafka-connect] Rebalance started (org.apache.kafka.connect.runtime.distributed.WorkerCoordinator) [DistributedHerder-connect-1-1]
      2020-07-09 08:51:25,739 INFO [Worker clientId=connect-1, groupId= kafka-connect] (Re-)joining group (org.apache.kafka.clients.consumer.internals.AbstractCoordinator) [DistributedHerder-connect-1-1]
      2020-07-09 08:51:25,740 INFO [Worker clientId=connect-1, groupId= kafka-connect] Was selected to perform assignments, but do not have latest config found in sync request. Returning an empty configuration to trigger re-sync. (org.apache.kafka.connect.runtime.distributed.IncrementalCooperativeAssignor) [DistributedHerder-connect-1-1]
      2020-07-09 08:51:25,742 INFO [Worker clientId=connect-1, groupId= kafka-connect] Successfully joined group with generation 305655833 (org.apache.kafka.clients.consumer.internals.AbstractCoordinator) [DistributedHerder-connect-1-1]
      2020-07-09 08:51:25,742 INFO [Worker clientId=connect-1, groupId= kafka-connect] Joined group at generation 305655833 with protocol version 2 and got assignment: Assignment{error=1, leader='connect-1-0008abc5-a152-42fe-a697-a4a4641f72bb', leaderUrl='http://10.20.30.221:8083/', offset=12, connectorIds=[], taskIds=[], revokedConnectorIds=[], revokedTaskIds=[], delay=0} with rebalance delay: 0 (org.apache.kafka.connect.runtime.distributed.DistributedHerder) [DistributedHerder-connect-1-1]
      2020-07-09 08:51:25,742 INFO [Worker clientId=connect-1, groupId= kafka-connect] Rebalance started (org.apache.kafka.connect.runtime.distributed.WorkerCoordinator) [DistributedHerder-connect-1-1]
      2020-07-09 08:51:25,742 INFO [Worker clientId=connect-1, groupId= kafka-connect] (Re-)joining group (org.apache.kafka.clients.consumer.internals.AbstractCoordinator) [DistributedHerder-connect-1-1]
      2020-07-09 08:51:25,744 INFO [Worker clientId=connect-1, groupId= kafka-connect] Was selected to perform assignments, but do not have latest config found in sync request. Returning an empty configuration to trigger re-sync. (org.apache.kafka.connect.runtime.distributed.IncrementalCooperativeAssignor) [DistributedHerder-connect-1-1]
      2020-07-09 08:51:25,746 INFO [Worker clientId=connect-1, groupId= kafka-connect] Successfully joined group with generation 305655834 (org.apache.kafka.clients.consumer.internals.AbstractCoordinator) [DistributedHerder-connect-1-1]
      2020-07-09 08:51:25,746 INFO [Worker clientId=connect-1, groupId= kafka-connect] Joined group at generation 305655834 with protocol version 2 and got assignment: Assignment{error=1, leader='connect-1-0008abc5-a152-42fe-a697-a4a4641f72bb', leaderUrl='http://10.20.30.221:8083/', offset=12, connectorIds=[], taskIds=[], revokedConnectorIds=[], revokedTaskIds=[], delay=0} with rebalance delay: 0 (org.apache.kafka.connect.runtime.distributed.DistributedHerder) [DistributedHerder-connect-1-1]
      2020-07-09 08:51:25,746 INFO [Worker clientId=connect-1, groupId= kafka-connect] Rebalance started (org.apache.kafka.connect.runtime.distributed.WorkerCoordinator) [DistributedHerder-connect-1-1]
      2020-07-09 08:51:25,746 INFO [Worker clientId=connect-1, groupId= kafka-connect] (Re-)joining group (org.apache.kafka.clients.consumer.internals.AbstractCoordinator) [DistributedHerder-connect-1-1]
      2020-07-09 08:51:25,748 INFO [Worker clientId=connect-1, groupId= kafka-connect] Was selected to perform assignments, but do not have latest config found in sync request. Returning an empty configuration to trigger re-sync. (org.apache.kafka.connect.runtime.distributed.IncrementalCooperativeAssignor) [DistributedHerder-connect-1-1]
      2020-07-09 08:51:25,750 INFO [Worker clientId=connect-1, groupId= kafka-connect] Successfully joined group with generation 305655835 (org.apache.kafka.clients.consumer.internals.AbstractCoordinator) [DistributedHerder-connect-1-1]
      2020-07-09 08:51:25,750 INFO [Worker clientId=connect-1, groupId= kafka-connect] Joined group at generation 305655835 with protocol version 2 and got assignment: Assignment{error=1, leader='connect-1-0008abc5-a152-42fe-a697-a4a4641f72bb', leaderUrl='http://10.20.30.221:8083/', offset=12, connectorIds=[], taskIds=[], revokedConnectorIds=[], revokedTaskIds=[], delay=0} with rebalance delay: 0 (org.apache.kafka.connect.runtime.distributed.DistributedHerder) [DistributedHerder-connect-1-1]
      2020-07-09 08:51:25,750 INFO [Worker clientId=connect-1, groupId= kafka-connect] Rebalance started (org.apache.kafka.connect.runtime.distributed.WorkerCoordinator) [DistributedHerder-connect-1-1]
      2020-07-09 08:51:25,750 INFO [Worker clientId=connect-1, groupId= kafka-connect] (Re-)joining group (org.apache.kafka.clients.consumer.internals.AbstractCoordinator) [DistributedHerder-connect-1-1]
      2020-07-09 08:51:25,751 INFO [Worker clientId=connect-1, groupId= kafka-connect] Was selected to perform assignments, but do not have latest config found in sync request. Returning an empty configuration to trigger re-sync. (org.apache.kafka.connect.runtime.distributed.IncrementalCooperativeAssignor) [DistributedHerder-connect-1-1]
      2020-07-09 08:51:25,754 INFO [Worker clientId=connect-1, groupId= kafka-connect] Successfully joined group with generation 305655836 (org.apache.kafka.clients.consumer.internals.AbstractCoordinator) [DistributedHerder-connect-1-1]
      2020-07-09 08:51:25,754 INFO [Worker clientId=connect-1, groupId= kafka-connect] Joined group at generation 305655836 with protocol version 2 and got assignment: Assignment{error=1, leader='connect-1-0008abc5-a152-42fe-a697-a4a4641f72bb', leaderUrl='http://10.20.30.221:8083/', offset=12, connectorIds=[], taskIds=[], revokedConnectorIds=[], revokedTaskIds=[], delay=0} with rebalance delay: 0 (org.apache.kafka.connect.runtime.distributed.DistributedHerder) [DistributedHerder-connect-1-1]
      2020-07-09 08:51:25,754 INFO [Worker clientId=connect-1, groupId= kafka-connect] Rebalance started (org.apache.kafka.connect.runtime.distributed.WorkerCoordinator) [DistributedHerder-connect-1-1]
      2020-07-09 08:51:25,754 INFO [Worker clientId=connect-1, groupId= kafka-connect] (Re-)joining group (org.apache.kafka.clients.consumer.internals.AbstractCoordinator) [DistributedHerder-connect-1-1]
      2020-07-09 08:51:25,755 INFO [Worker clientId=connect-1, groupId= kafka-connect] Was selected to perform assignments, but do not have latest config found in sync request. Returning an empty configuration to trigger re-sync. (org.apache.kafka.connect.runtime.distributed.IncrementalCooperativeAssignor) [DistributedHerder-connect-1-1]
      2020-07-09 08:51:25,757 INFO [Worker clientId=connect-1, groupId= kafka-connect] Successfully joined group with generation 305655837 (org.apache.kafka.clients.consumer.internals.AbstractCoordinator) [DistributedHerder-connect-1-1]
      2020-07-09 08:51:25,757 INFO [Worker clientId=connect-1, groupId= kafka-connect] Joined group at generation 305655837 with protocol version 2 and got assignment: Assignment{error=1, leader='connect-1-0008abc5-a152-42fe-a697-a4a4641f72bb', leaderUrl='http://10.20.30.221:8083/', offset=12, connectorIds=[], taskIds=[], revokedConnectorIds=[], revokedTaskIds=[], delay=0} with rebalance delay: 0 (org.apache.kafka.connect.runtime.distributed.DistributedHerder) [DistributedHerder-connect-1-1]
      2020-07-09 08:51:25,757 INFO [Worker clientId=connect-1, groupId= kafka-connect] Rebalance started (org.apache.kafka.connect.runtime.distributed.WorkerCoordinator) [DistributedHerder-connect-1-1]
      2020-07-09 08:51:25,757 INFO [Worker clientId=connect-1, groupId= kafka-connect] (Re-)joining group (org.apache.kafka.clients.consumer.internals.AbstractCoordinator) [DistributedHerder-connect-1-1]
      2020-07-09 08:51:25,759 INFO [Worker clientId=connect-1, groupId= kafka-connect] Was selected to perform assignments, but do not have latest config found in sync request. Returning an empty configuration to trigger re-sync. (org.apache.kafka.connect.runtime.distributed.IncrementalCooperativeAssignor) [DistributedHerder-connect-1-1]
      2020-07-09 08:51:25,761 INFO [Worker clientId=connect-1, groupId= kafka-connect] Successfully joined group with generation 305655838 (org.apache.kafka.clients.consumer.internals.AbstractCoordinator) [DistributedHerder-connect-1-1]
      2020-07-09 08:51:25,761 INFO [Worker clientId=connect-1, groupId= kafka-connect] Joined group at generation 305655838 with protocol version 2 and got assignment: Assignment{error=1, leader='connect-1-0008abc5-a152-42fe-a697-a4a4641f72bb', leaderUrl='http://10.20.30.221:8083/', offset=12, connectorIds=[], taskIds=[], revokedConnectorIds=[], revokedTaskIds=[], delay=0} with rebalance delay: 0 (org.apache.kafka.connect.runtime.distributed.DistributedHerder) [DistributedHerder-connect-1-1]
      2020-07-09 08:51:25,761 INFO [Worker clientId=connect-1, groupId= kafka-connect] Rebalance started (org.apache.kafka.connect.runtime.distributed.WorkerCoordinator) [DistributedHerder-connect-1-1]
      2020-07-09 08:51:25,761 INFO [Worker clientId=connect-1, groupId= kafka-connect] (Re-)joining group (org.apache.kafka.clients.consumer.internals.AbstractCoordinator) [DistributedHerder-connect-1-1]
      2020-07-09 08:51:25,763 INFO [Worker clientId=connect-1, groupId= kafka-connect] Was selected to perform assignments, but do not have latest config found in sync request. Returning an empty configuration to trigger re-sync. (org.apache.kafka.connect.runtime.distributed.IncrementalCooperativeAssignor) [DistributedHerder-connect-1-1]
      2020-07-09 08:51:25,766 INFO [Worker clientId=connect-1, groupId= kafka-connect] Successfully joined group with generation 305655839 (org.apache.kafka.clients.consumer.internals.AbstractCoordinator) [DistributedHerder-connect-1-1]
      2020-07-09 08:51:25,766 INFO [Worker clientId=connect-1, groupId= kafka-connect] Joined group at generation 305655839 with protocol version 2 and got assignment: Assignment{error=1, leader='connect-1-0008abc5-a152-42fe-a697-a4a4641f72bb', leaderUrl='http://10.20.30.221:8083/', offset=12, connectorIds=[], taskIds=[], revokedConnectorIds=[], revokedTaskIds=[], delay=0} with rebalance delay: 0 (org.apache.kafka.connect.runtime.distributed.DistributedHerder) [DistributedHerder-connect-1-1]
      2020-07-09 08:51:25,766 INFO [Worker clientId=connect-1, groupId= kafka-connect] Rebalance started (org.apache.kafka.connect.runtime.distributed.WorkerCoordinator) [DistributedHerder-connect-1-1]
      2020-07-09 08:51:25,766 INFO [Worker clientId=connect-1, groupId= kafka-connect] (Re-)joining group (org.apache.kafka.clients.consumer.internals.AbstractCoordinator) [DistributedHerder-connect-1-1]
      2020-07-09 08:51:25,768 INFO [Worker clientId=connect-1, groupId= kafka-connect] Was selected to perform assignments, but do not have latest config found in sync request. Returning an empty configuration to trigger re-sync. (org.apache.kafka.connect.runtime.distributed.IncrementalCooperativeAssignor) [DistributedHerder-connect-1-1]
      2020-07-09 08:51:25,771 INFO [Worker clientId=connect-1, groupId= kafka-connect] Successfully joined group with generation 305655840 (org.apache.kafka.clients.consumer.internals.AbstractCoordinator) [DistributedHerder-connect-1-1]
      2020-07-09 08:51:25,771 INFO [Worker clientId=connect-1, groupId= kafka-connect] Joined group at generation
      
      
      
      

      This happens on all 3 workers.
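      To quantify the loop from a worker log, a small script can pull the generation numbers out of the "Successfully joined group with generation N" lines and measure how fast they advance. The following is a minimal sketch and not part of the original report: the name `rebalance_rate` is illustrative, and the regular expression assumes the log layout shown above.

```python
import re
from datetime import datetime

# Captures the timestamp and the generation from lines such as:
# "2020-07-09 08:51:25,735 INFO [...] Successfully joined group with generation 305655831 (...)"
JOINED = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) .*"
    r"Successfully joined group with generation (\d+)"
)

def rebalance_rate(lines):
    """Return (generations_elapsed, seconds_elapsed) between the first and last join seen."""
    joins = []
    for line in lines:
        m = JOINED.search(line)
        if m:
            ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S,%f")
            joins.append((ts, int(m.group(2))))
    if len(joins) < 2:
        # Not enough joins to measure a rate.
        return 0, 0.0
    (t0, g0), (t1, g1) = joins[0], joins[-1]
    return g1 - g0, (t1 - t0).total_seconds()

if __name__ == "__main__":
    import sys
    gens, secs = rebalance_rate(sys.stdin)
    print(f"{gens} generations in {secs:.3f} s")
```

      In the excerpt above, the generation advances from 305655831 (08:51:25,735) to 305655840 (08:51:25,771), that is 9 rebalances in 36 ms, or roughly one every 4 ms.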

       

      On the broker side, we see the following:

      2020-07-09 16:39:46,260 INFO [GroupCoordinator 0]: Preparing to rebalance group kafka-connect in state PreparingRebalance with old generation 311127279 (__consumer_offsets-7) (reason: Updating metadata for member connect-1-bdf1c132-3311-493a-894a-5fffcac41ec7) (kafka.coordinator.group.GroupCoordinator) [data-plane-kafka-request-handler-0]
      2020-07-09 16:39:46,261 INFO [GroupCoordinator 0]: Stabilized group kafka-connect generation 311127280 (__consumer_offsets-7) (kafka.coordinator.group.GroupCoordinator) [data-plane-kafka-request-handler-5]
      2020-07-09 16:39:46,262 INFO [GroupCoordinator 0]: Assignment received from leader for group kafka-connect for generation 311127280 (kafka.coordinator.group.GroupCoordinator) [data-plane-kafka-request-handler-1]
      2020-07-09 16:39:46,265 INFO [GroupCoordinator 0]: Preparing to rebalance group kafka-connect in state PreparingRebalance with old generation 311127280 (__consumer_offsets-7) (reason: Updating metadata for member connect-1-bdf1c132-3311-493a-894a-5fffcac41ec7) (kafka.coordinator.group.GroupCoordinator) [data-plane-kafka-request-handler-1]
      2020-07-09 16:39:46,266 INFO [GroupCoordinator 0]: Stabilized group kafka-connect generation 311127281 (__consumer_offsets-7) (kafka.coordinator.group.GroupCoordinator) [data-plane-kafka-request-handler-6]
      2020-07-09 16:39:46,267 INFO [GroupCoordinator 0]: Assignment received from leader for group kafka-connect for generation 311127281 (kafka.coordinator.group.GroupCoordinator) [data-plane-kafka-request-handler-1]
      2020-07-09 16:39:46,270 INFO [GroupCoordinator 0]: Preparing to rebalance group kafka-connect in state PreparingRebalance with old generation 311127281 (__consumer_offsets-7) (reason: Updating metadata for member connect-1-bdf1c132-3311-493a-894a-5fffcac41ec7) (kafka.coordinator.group.GroupCoordinator) [data-plane-kafka-request-handler-7]
      2020-07-09 16:39:46,271 INFO [GroupCoordinator 0]: Stabilized group kafka-connect generation 311127282 (__consumer_offsets-7) (kafka.coordinator.group.GroupCoordinator) [data-plane-kafka-request-handler-6]
      2020-07-09 16:39:46,272 INFO [GroupCoordinator 0]: Assignment received from leader for group kafka-connect for generation 311127282 (kafka.coordinator.group.GroupCoordinator) [data-plane-kafka-request-handler-1]
      2020-07-09 16:39:46,275 INFO [GroupCoordinator 0]: Preparing to rebalance group kafka-connect in state PreparingRebalance with old generation 311127282 (__consumer_offsets-7) (reason: Updating metadata for member connect-1-bdf1c132-3311-493a-894a-5fffcac41ec7) (kafka.coordinator.group.GroupCoordinator) [data-plane-kafka-request-handler-3]
      2020-07-09 16:39:46,276 INFO [GroupCoordinator 0]: Stabilized group kafka-connect generation 311127283 (__consumer_offsets-7) (kafka.coordinator.group.GroupCoordinator) [data-plane-kafka-request-handler-7]
      2020-07-09 16:39:46,277 INFO [GroupCoordinator 0]: Assignment received from leader for group kafka-connect for generation 311127283 (kafka.coordinator.group.GroupCoordinator) [data-plane-kafka-request-handler-5]
      2020-07-09 16:39:46,280 INFO [GroupCoordinator 0]: Preparing to rebalance group kafka-connect in state PreparingRebalance with old generation 311127283 (__consumer_offsets-7) (reason: Updating metadata for member connect-1-bdf1c132-3311-493a-894a-5fffcac41ec7) (kafka.coordinator.group.GroupCoordinator) [data-plane-kafka-request-handler-5]
      2020-07-09 16:39:46,281 INFO [GroupCoordinator 0]: Stabilized group kafka-connect generation 311127284 (__consumer_offsets-7) (kafka.coordinator.group.GroupCoordinator) [data-plane-kafka-request-handler-7]
      2020-07-09 16:39:46,282 INFO [GroupCoordinator 0]: Assignment received from leader for group kafka-connect for generation 311127284 (kafka.coordinator.group.GroupCoordinator) [data-plane-kafka-request-handler-3]
      2020-07-09 16:39:46,285 INFO [GroupCoordinator 0]: Preparing to rebalance group kafka-connect in state PreparingRebalance with old generation 311127284 (__consumer_offsets-7) (reason: Updating metadata for member connect-1-bdf1c132-3311-493a-894a-5fffcac41ec7) (kafka.coordinator.group.GroupCoordinator) [data-plane-kafka-request-handler-1]
      2020-07-09 16:39:46,286 INFO [GroupCoordinator 0]: Stabilized group kafka-connect generation 311127285 (__consumer_offsets-7) (kafka.coordinator.group.GroupCoordinator) [data-plane-kafka-request-handler-4]
      2020-07-09 16:39:46,287 INFO [GroupCoordinator 0]: Assignment received from leader for group kafka-connect for generation 311127285 (kafka.coordinator.group.GroupCoordinator) [data-plane-kafka-request-handler-7]
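      The broker excerpt can be summarized in a similar way by asking how long each generation stays stable before the next rebalance is triggered. This is a minimal sketch under the assumption that the GroupCoordinator lines follow the layout shown above; `stable_intervals_ms` is an illustrative name, not a Kafka tool.

```python
import re
from datetime import datetime, timedelta

TS = r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})"
# "... Stabilized group kafka-connect generation 311127280 (...)"
STABILIZED = re.compile(TS + r" .*Stabilized group \S+ generation (\d+)")
# "... Preparing to rebalance group kafka-connect ... with old generation 311127280 (...)"
PREPARING = re.compile(TS + r" .*Preparing to rebalance group \S+ .*old generation (\d+)")

def _parse(ts):
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S,%f")

def stable_intervals_ms(lines):
    """Map generation -> whole milliseconds between its stabilization and the rebalance that replaces it."""
    stabilized = {}
    intervals = {}
    for line in lines:
        m = STABILIZED.search(line)
        if m:
            stabilized[int(m.group(2))] = _parse(m.group(1))
            continue
        m = PREPARING.search(line)
        if m:
            gen = int(m.group(2))
            if gen in stabilized:
                delta = _parse(m.group(1)) - stabilized[gen]
                intervals[gen] = delta // timedelta(milliseconds=1)
    return intervals
```

      In the excerpt above, generation 311127280 is stabilized at 16:39:46,261 and the next rebalance is prepared at 16:39:46,265, so the group stays stable for about 4 ms. Every rebalance in the excerpt is logged with the reason "Updating metadata for member", always for the same member ID.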
      
      

       

      Any feedback is appreciated!

      Thanks!

          People

            Assignee: Unassigned
            Reporter: Konstantin Lalafaryan (klalafaryan)
            Votes: 3
            Watchers: 4
