Kafka / KAFKA-9844

Maximum number of members within a group is not always enforced due to a race condition in join group

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.5.0
    • Fix Version/s: 2.6.0
    • Component/s: None
    • Labels: None

      Description

      While analysing https://issues.apache.org/jira/browse/KAFKA-7965, I found that the constraint on the maximum number of members is not always enforced because of a race condition.

      When an unknown member joins the group, the group is automatically created if it does not exist. The request then proceeds with an unknownJoinGroup. On that path, the limit is not enforced because we assume that the group is empty at this stage, since it did not exist a moment earlier. As the lookup and the creation are not protected by a lock, multiple join requests can end up on that path concurrently and thus bypass the enforcement.
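
      The check-then-act pattern described above can be illustrated with a minimal Java sketch. This is a simplified model under stated assumptions, not the coordinator's actual code: the class GroupJoinSketch and the methods lookup, joinAfterMiss and joinChecked are hypothetical names introduced only for illustration. The racy path is split into two steps so that the problematic interleaving can be replayed deterministically; joinChecked shows one possible way to close the gap, by doing the size check and the insertion under the group's own lock whether or not the group was just created.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical, simplified model of a group registry; not Kafka's classes.
class GroupJoinSketch {

    static final class Group {
        private final Set<String> members = new HashSet<>();

        // Adds a member without looking at the limit (models the unchecked path).
        synchronized void addUnchecked(String memberId) {
            members.add(memberId);
        }

        // Size check and insertion happen atomically under the group's lock.
        synchronized boolean addIfBelow(int maxSize, String memberId) {
            if (members.size() >= maxSize) {
                return false;
            }
            members.add(memberId);
            return true;
        }

        synchronized int size() {
            return members.size();
        }
    }

    private final ConcurrentMap<String, Group> groups = new ConcurrentHashMap<>();
    private final int maxSize;

    GroupJoinSketch(int maxSize) {
        this.maxSize = maxSize;
    }

    // Racy path, step 1: look the group up without holding any lock.
    Group lookup(String groupId) {
        return groups.get(groupId);
    }

    // Racy path, step 2: the lookup missed, so the group is assumed to be
    // empty; it is created if needed and the member is added with no limit check.
    void joinAfterMiss(String groupId, String memberId) {
        Group group = groups.computeIfAbsent(groupId, id -> new Group());
        group.addUnchecked(memberId);
    }

    // Checked path: get or create the group, then always enforce the limit
    // under the group's lock, even if the group was created a moment ago.
    boolean joinChecked(String groupId, String memberId) {
        Group group = groups.computeIfAbsent(groupId, id -> new Group());
        return group.addIfBelow(maxSize, memberId);
    }

    int size(String groupId) {
        Group group = groups.get(groupId);
        return group == null ? 0 : group.size();
    }

    public static void main(String[] args) {
        // Replay the bad interleaving: three requests all miss on lookup
        // before any of them has created the group.
        GroupJoinSketch racy = new GroupJoinSketch(2);
        boolean allMissed = racy.lookup("g") == null
                && racy.lookup("g") == null
                && racy.lookup("g") == null;
        racy.joinAfterMiss("g", "m1");
        racy.joinAfterMiss("g", "m2");
        racy.joinAfterMiss("g", "m3");
        System.out.println(allMissed + " " + racy.size("g")); // prints "true 3"

        // With the checked path, the third member is rejected.
        GroupJoinSketch checked = new GroupJoinSketch(2);
        System.out.println(checked.joinChecked("g", "m1")); // prints "true"
        System.out.println(checked.joinChecked("g", "m2")); // prints "true"
        System.out.println(checked.joinChecked("g", "m3")); // prints "false"
    }
}
```

      In the replayed interleaving, a limit of 2 still ends with 3 members, which matches the symptom in this report. Whether the actual fix takes this exact shape is not stated here; the sketch only shows that the check has to be made atomically with the insertion.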

      Here is an example of the logs captured while troubleshooting KAFKA-7965. The test sets up 3 consumers and uses a limit of 2. The logs show that all three members were able to join the group without any of them being evicted.

      [2020-04-05 13:29:03,145] INFO [Consumer clientId=ConsumerTestConsumer, groupId=group-max-size-test] Discovered group coordinator localhost:36449 (id: 2147483645 rack: null) (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:794)
      [2020-04-05 13:29:03,145] INFO [Consumer clientId=ConsumerTestConsumer, groupId=group-max-size-test] Discovered group coordinator localhost:36449 (id: 2147483645 rack: null) (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:794)
      [2020-04-05 13:29:03,151] INFO [Consumer clientId=ConsumerTestConsumer, groupId=group-max-size-test] Discovered group coordinator localhost:36449 (id: 2147483645 rack: null) (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:794)
      [2020-04-05 13:29:03,153] INFO [Consumer clientId=ConsumerTestConsumer, groupId=group-max-size-test] Attempt to heartbeat failed since member id ConsumerTestConsumer-764a71ea-f9b3-462c-9986-8e6b2530d6e3 is not valid. (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:1054)
      [2020-04-05 13:29:03,155] INFO [Consumer clientId=ConsumerTestConsumer, groupId=group-max-size-test] Giving away all assigned partitions as lost since generation has been reset,indicating that consumer is no longer part of the group (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:670)
      [2020-04-05 13:29:03,155] INFO [Consumer clientId=ConsumerTestConsumer, groupId=group-max-size-test] Lost previously assigned partitions group-max-size-test-5, group-max-size-test-4 (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:314)
      [2020-04-05 13:29:03,156] INFO [Consumer clientId=ConsumerTestConsumer, groupId=group-max-size-test] (Re-)joining group (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:551)
      [2020-04-05 13:29:03,154] INFO [Consumer clientId=ConsumerTestConsumer, groupId=group-max-size-test] Attempt to heartbeat failed since member id ConsumerTestConsumer-2d2886ad-1244-4ef7-9e07-62282c3547fd is not valid. (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:1054)
      [2020-04-05 13:29:03,156] INFO [Consumer clientId=ConsumerTestConsumer, groupId=group-max-size-test] Attempt to heartbeat failed since member id ConsumerTestConsumer-42d0fa9d-cfbb-458f-afe9-99a75fef8e08 is not valid. (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:1054)
      [2020-04-05 13:29:03,157] INFO [Consumer clientId=ConsumerTestConsumer, groupId=group-max-size-test] Giving away all assigned partitions as lost since generation has been reset,indicating that consumer is no longer part of the group (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:670)
      [2020-04-05 13:29:03,158] INFO [Consumer clientId=ConsumerTestConsumer, groupId=group-max-size-test] Lost previously assigned partitions group-max-size-test-2, group-max-size-test-3 (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:314)
      [2020-04-05 13:29:03,158] INFO [Consumer clientId=ConsumerTestConsumer, groupId=group-max-size-test] (Re-)joining group (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:551)
      [2020-04-05 13:29:03,157] INFO [Consumer clientId=ConsumerTestConsumer, groupId=group-max-size-test] Giving away all assigned partitions as lost since generation has been reset,indicating that consumer is no longer part of the group (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:670)
      [2020-04-05 13:29:03,159] INFO [Consumer clientId=ConsumerTestConsumer, groupId=group-max-size-test] Lost previously assigned partitions group-max-size-test-1, group-max-size-test-0 (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:314)
      [2020-04-05 13:29:03,159] INFO [Consumer clientId=ConsumerTestConsumer, groupId=group-max-size-test] (Re-)joining group (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:551)
      [2020-04-05 13:29:03,160] INFO [Consumer clientId=ConsumerTestConsumer, groupId=group-max-size-test] (Re-)joining group (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:551)
      [2020-04-05 13:29:03,161] INFO [GroupCoordinator 2]: Preparing to rebalance group group-max-size-test in state PreparingRebalance with old generation 0 (__consumer_offsets-0) (reason: Adding new member ConsumerTestConsumer-84fd5153-c425-464d-a724-04022a0608f7 with group instanceid None) (kafka.coordinator.group.GroupCoordinator:66)
      [2020-04-05 13:29:03,158] INFO [Consumer clientId=ConsumerTestConsumer, groupId=group-max-size-test] (Re-)joining group (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:551)
      [2020-04-05 13:29:03,160] INFO [Consumer clientId=ConsumerTestConsumer, groupId=group-max-size-test] (Re-)joining group (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:551)
      [2020-04-05 13:29:03,171] INFO [GroupCoordinator 2]: Stabilized group group-max-size-test generation 1 (__consumer_offsets-0) (kafka.coordinator.group.GroupCoordinator:66)
      [2020-04-05 13:29:03,605] INFO [Consumer clientId=ConsumerTestConsumer, groupId=group-max-size-test] Finished assignment for group at generation 1: {ConsumerTestConsumer-84fd5153-c425-464d-a724-04022a0608f7=Assignment(partitions=[group-max-size-test-0, group-max-size-test-1]), ConsumerTestConsumer-e25aedeb-73fd-4fae-b56c-fa929f11a9df=Assignment(partitions=[group-max-size-test-4, group-max-size-test-5]), ConsumerTestConsumer-8ca065a1-2ce4-44d5-881c-c6f01cb0d110=Assignment(partitions=[group-max-size-test-2, group-max-size-test-3])} (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:605)
      [2020-04-05 13:29:03,606] INFO [GroupCoordinator 2]: Assignment received from leader for group group-max-size-test for generation 1 (kafka.coordinator.group.GroupCoordinator:66)
      [2020-04-05 13:29:03,610] INFO [Consumer clientId=ConsumerTestConsumer, groupId=group-max-size-test] Successfully joined group with generation 1 (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:502)
      [2020-04-05 13:29:03,611] INFO [Consumer clientId=ConsumerTestConsumer, groupId=group-max-size-test] Adding newly assigned partitions: group-max-size-test-1, group-max-size-test-0 (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:276)
      [2020-04-05 13:29:03,612] INFO [Consumer clientId=ConsumerTestConsumer, groupId=group-max-size-test] Found no committed offset for partition group-max-size-test-1 (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:1297)
      [2020-04-05 13:29:03,612] INFO [Consumer clientId=ConsumerTestConsumer, groupId=group-max-size-test] Found no committed offset for partition group-max-size-test-0 (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:1297)
      [2020-04-05 13:29:03,611] INFO [Consumer clientId=ConsumerTestConsumer, groupId=group-max-size-test] Successfully joined group with generation 1 (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:502)
      [2020-04-05 13:29:03,611] INFO [Consumer clientId=ConsumerTestConsumer, groupId=group-max-size-test] Successfully joined group with generation 1 (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:502)
      [2020-04-05 13:29:03,614] INFO [Consumer clientId=ConsumerTestConsumer, groupId=group-max-size-test] Adding newly assigned partitions: group-max-size-test-2, group-max-size-test-3 (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:276)
      [2020-04-05 13:29:03,614] INFO [Consumer clientId=ConsumerTestConsumer, groupId=group-max-size-test] Adding newly assigned partitions: group-max-size-test-5, group-max-size-test-4 (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:276)
      [2020-04-05 13:29:03,616] INFO [Consumer clientId=ConsumerTestConsumer, groupId=group-max-size-test] Found no committed offset for partition group-max-size-test-2 (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:1297)
      [2020-04-05 13:29:03,617] INFO [Consumer clientId=ConsumerTestConsumer, groupId=group-max-size-test] Found no committed offset for partition group-max-size-test-3 (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:1297)
      [2020-04-05 13:29:03,617] INFO [Consumer clientId=ConsumerTestConsumer, groupId=group-max-size-test] Resetting offset for partition group-max-size-test-1 to offset 0. (org.apache.kafka.clients.consumer.internals.SubscriptionState:383)
      [2020-04-05 13:29:03,617] INFO [Consumer clientId=ConsumerTestConsumer, groupId=group-max-size-test] Found no committed offset for partition group-max-size-test-5 (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:1297)
      [2020-04-05 13:29:03,618] INFO [Consumer clientId=ConsumerTestConsumer, groupId=group-max-size-test] Found no committed offset for partition group-max-size-test-4 (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:1297)
      [2020-04-05 13:29:03,619] INFO [Consumer clientId=ConsumerTestConsumer, groupId=group-max-size-test] Resetting offset for partition group-max-size-test-3 to offset 0. (org.apache.kafka.clients.consumer.internals.SubscriptionState:383)
      [2020-04-05 13:29:03,619] INFO [Consumer clientId=ConsumerTestConsumer, groupId=group-max-size-test] Resetting offset for partition group-max-size-test-4 to offset 0. (org.apache.kafka.clients.consumer.internals.SubscriptionState:383)
      [2020-04-05 13:29:03,645] INFO [Consumer clientId=ConsumerTestConsumer, groupId=group-max-size-test] Resetting offset for partition group-max-size-test-2 to offset 0. (org.apache.kafka.clients.consumer.internals.SubscriptionState:383)
      [2020-04-05 13:29:03,646] INFO [Consumer clientId=ConsumerTestConsumer, groupId=group-max-size-test] Resetting offset for partition group-max-size-test-0 to offset 0. (org.apache.kafka.clients.consumer.internals.SubscriptionState:383)
      [2020-04-05 13:29:03,651] INFO [Consumer clientId=ConsumerTestConsumer, groupId=group-max-size-test] Resetting offset for partition group-max-size-test-5 to offset 0. (org.apache.kafka.clients.consumer.internals.SubscriptionState:383)

              People

              • Assignee: David Jacot (dajac)
              • Reporter: David Jacot (dajac)