KAFKA-5586

Handle client disconnects during JoinGroup

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: consumer
    • Labels: None

    Description

      If a consumer disconnects with a JoinGroup in flight, we do not remove it from the group until after the Join phase completes. If the client immediately re-sends the JoinGroup request and it already had a memberId, then the pending callback is replaced and no harm is done. For the other cases:

      1. If the client disconnected due to a failure and does not re-send the JoinGroup, the consumer will still be included in the new group generation after the rebalance completes, but will immediately time out and trigger a new rebalance.
      2. If the consumer was not a member of the group and re-sends JoinGroup, then a new memberId will be created for that consumer and the old one will not be removed. When the rebalance completes, the old memberId will time out and another rebalance will be triggered.

      To address these issues, we should add some additional logic to handle client disconnections during the join phase. For newly generated memberIds, we should simply remove them. For existing members, we should probably leave them in the group and reset the heartbeat expiration task.
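
The proposed handling can be sketched roughly as follows. This is a simplified Python model, not the coordinator's actual code: `Member`, `Group`, `on_disconnect_during_join`, and the `newly_generated` flag are hypothetical names introduced only for illustration, and the heartbeat expiration task is reduced to a single deadline timestamp.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Member:
    member_id: str
    # Assumption: the coordinator can tell whether this memberId was created
    # by the JoinGroup that is currently in flight.
    newly_generated: bool
    heartbeat_deadline_ms: int = 0


@dataclass
class Group:
    session_timeout_ms: int
    members: Dict[str, Member] = field(default_factory=dict)

    def add(self, member: Member) -> None:
        self.members[member.member_id] = member

    def on_disconnect_during_join(self, member_id: str, now_ms: int) -> str:
        """Apply the proposed policy to one member whose client disconnected."""
        member = self.members.get(member_id)
        if member is None:
            return "ignored"
        if member.newly_generated:
            # The client never learned this id, so it can simply be removed.
            del self.members[member_id]
            return "removed"
        # Existing member: leave it in the group and restart its session timer.
        member.heartbeat_deadline_ms = now_ms + self.session_timeout_ms
        return "heartbeat_reset"


if __name__ == "__main__":
    group = Group(session_timeout_ms=10_000)
    group.add(Member("new-1", newly_generated=True))
    group.add(Member("old-1", newly_generated=False))
    print(group.on_disconnect_during_join("new-1", now_ms=1_000))
    print(group.on_disconnect_during_join("old-1", now_ms=1_000))
```

In this model a newly generated memberId disappears before the rebalance completes, so it cannot time out afterwards and trigger a second rebalance, while an existing member keeps its place and gets a fresh session timeout in which to rejoin.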

      Note that we currently have no facility to expose disconnects from the network layer to the other layers, so we need to find a good approach for this.
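
One possible shape for such a facility is a listener registry that the network layer calls when a connection closes. The sketch below is only an illustration under that assumption: `NetworkLayer`, `JoinTracker`, and every method name are hypothetical and do not correspond to existing Kafka classes.

```python
from typing import Callable, Dict, List

DisconnectListener = Callable[[str], None]


class NetworkLayer:
    """Hypothetical stand-in for the broker's network layer."""

    def __init__(self) -> None:
        self._listeners: List[DisconnectListener] = []

    def register_disconnect_listener(self, listener: DisconnectListener) -> None:
        self._listeners.append(listener)

    def connection_closed(self, connection_id: str) -> None:
        # Notify every registered listener that this connection went away.
        for listener in self._listeners:
            listener(connection_id)


class JoinTracker:
    """Tracks which connections currently have a JoinGroup in flight."""

    def __init__(self) -> None:
        self._pending: Dict[str, str] = {}  # connection id -> member id
        self.disconnected_members: List[str] = []

    def join_received(self, connection_id: str, member_id: str) -> None:
        self._pending[connection_id] = member_id

    def join_completed(self, connection_id: str) -> None:
        self._pending.pop(connection_id, None)

    def on_disconnect(self, connection_id: str) -> None:
        # Only disconnects with a JoinGroup still pending are of interest.
        member_id = self._pending.pop(connection_id, None)
        if member_id is not None:
            self.disconnected_members.append(member_id)


if __name__ == "__main__":
    network = NetworkLayer()
    tracker = JoinTracker()
    network.register_disconnect_listener(tracker.on_disconnect)
    tracker.join_received("conn-1", "member-a")
    network.connection_closed("conn-1")
    print(tracker.disconnected_members)
```

Keeping the connection-to-member mapping outside the network layer means the network code only needs to report a connection id, and the group logic decides whether the disconnect matters.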

            People

              Assignee: Unassigned
              Reporter: Jason Gustafson (hachikuji)