Kafka / KAFKA-15676

Scheduled rebalance delay for Connect is unnecessarily triggered when group coordinator bounces

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: connect
    • Labels: None

    Description

      When a Connect worker loses contact with the group coordinator, it voluntarily gives up (i.e., stops) its assignment of connectors and tasks (for more context, see KAFKA-9184). However, this change in state is not relayed to the worker's instance of the IncrementalCooperativeAssignor class.

      If the group coordinator for a Connect cluster is unavailable for long enough, all of the workers in the cluster will revoke their assigned connectors and tasks and, upon rejoining the group, report that they have been assigned no connectors and tasks.

      If a worker's member ID is reset before it rejoins the group (which can happen if, for example, the worker exceeds its maximum poll interval), the leader of the cluster will not treat the worker as having rejoined the group. Instead, it will act as if the worker had left the group and a new, unrelated worker had joined during the same rebalance. This triggers the scheduled rebalance delay, and the connectors and tasks previously assigned to that worker remain unassigned until the delay expires.
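
      The scenario described above can be illustrated with a minimal sketch. It is a simplified, hypothetical model of a leader comparing the previous assignment with the members present in the current rebalance; it is not the actual IncrementalCooperativeAssignor logic, and the class name, method names, and member IDs are all assumptions made for illustration. The sketch assumes the leader keys its bookkeeping on member IDs alone, so a worker that returns with a fresh member ID looks like one lost worker plus one new worker.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical, simplified model; this is NOT Kafka's IncrementalCooperativeAssignor.
final class RebalanceDelaySketch {

    // What the (modeled) leader concludes after comparing two generations of the group.
    static final class Outcome {
        final Set<String> lostMembers;
        final Set<String> newMembers;
        final boolean delayTriggered;

        Outcome(Set<String> lostMembers, Set<String> newMembers, boolean delayTriggered) {
            this.lostMembers = lostMembers;
            this.newMembers = newMembers;
            this.delayTriggered = delayTriggered;
        }

        @Override
        public String toString() {
            return "lost=" + lostMembers + ", new=" + newMembers
                    + ", delayTriggered=" + delayTriggered;
        }
    }

    // previousAssignment: member ID -> connectors/tasks the leader last assigned to it.
    // currentMembers: member IDs present in the current rebalance.
    static Outcome rebalance(Map<String, List<String>> previousAssignment,
                             Set<String> currentMembers) {
        // Members the leader remembers but that are absent now are treated as lost.
        Set<String> lost = new TreeSet<>(previousAssignment.keySet());
        lost.removeAll(currentMembers);

        // Members the leader has never seen are treated as brand-new workers.
        Set<String> fresh = new TreeSet<>(currentMembers);
        fresh.removeAll(previousAssignment.keySet());

        // Assumption: the delay starts when a lost member previously held any work.
        boolean delay = false;
        for (String member : lost) {
            if (!previousAssignment.get(member).isEmpty()) {
                delay = true;
            }
        }
        return new Outcome(lost, fresh, delay);
    }

    public static void main(String[] args) {
        Map<String, List<String>> previous = Map.of(
                "worker-a-id-1", List.of("connector-1", "task-1-0"),
                "worker-b-id-1", List.of("task-1-1"));

        // Both workers rejoin with their old member IDs: nothing looks lost.
        System.out.println(rebalance(previous, Set.of("worker-a-id-1", "worker-b-id-1")));

        // Worker B's member ID was reset: it looks like a departure plus a new arrival.
        System.out.println(rebalance(previous, Set.of("worker-a-id-1", "worker-b-id-2")));
    }
}
```

      In this model, the second call reports the old member ID as lost and the new one as unrelated, which is the condition under which the delay would start even though the same physical worker is back. In a real cluster the length of that delay is governed by the worker's scheduled.rebalance.max.delay.ms setting.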

          People

            Assignee: Unassigned
            Reporter: Chris Egerton
            Votes: 0
            Watchers: 1
