  1. Kafka
  2. KAFKA-4479

Streams tests should pass without hardcoded Time.SYSTEM in GroupCoordinator

Details

    • Type: Improvement
    • Status: Reopened
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: streams, unit tests

    Description

      If we pass `KafkaServer.time` to `GroupCoordinator`[1], some Streams tests such as `QueryableStateIntegrationTest` fail semi-regularly. damianguy looked into it and described it as:

      Looking at the sequence of events, one thread is stopped, and hence leaves the group triggering a rebalance, but the other thread doesn’t seem to get the memo, tries to commit, fails, and then game-over.

      So, in the case where it fails, the one live thread is not getting a rebalance. This would happen during a `poll(..)`, right? However, I can see the thread polling many times after the other thread has shut down.

      It tries to commit every time around the loop, so:

      poll(..)
      process(..)
      maybeCommit(..)

      and there is less than 10 ms between calls to `poll`.
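
      The sequence described above can be sketched as a toy model with no Kafka dependency. Every name here (`ToyGroup`, `ToyThread`, `runOnce`, `rejoinsOnPoll`) is hypothetical and exists only for illustration; the sketch assumes a commit is rejected whenever the committing member is on a stale group generation, and it does not reproduce the actual coordinator or consumer code.

```java
import java.util.ArrayList;
import java.util.List;

public class StreamLoopSketch {

    /** Hypothetical stand-in for the group coordinator: tracks only a generation id. */
    static class ToyGroup {
        int generation = 1;

        // A member leaving bumps the generation, meaning a rebalance is required.
        void leave() {
            generation++;
        }

        // Assumption: a commit is accepted only from a member on the current generation.
        boolean commit(int memberGeneration) {
            return memberGeneration == generation;
        }
    }

    /** Hypothetical stand-in for one stream thread running poll/process/maybeCommit. */
    static class ToyThread {
        final ToyGroup group;
        final boolean rejoinsOnPoll; // false models the thread that "doesn't get the memo"
        final List<String> log = new ArrayList<>();
        int generation;

        ToyThread(ToyGroup group, boolean rejoinsOnPoll) {
            this.group = group;
            this.rejoinsOnPoll = rejoinsOnPoll;
            this.generation = group.generation;
        }

        void poll() {
            // A healthy member learns about the rebalance during poll and rejoins.
            if (rejoinsOnPoll) {
                generation = group.generation;
            }
            log.add("poll");
        }

        void process() {
            log.add("process");
        }

        boolean maybeCommit() {
            boolean ok = group.commit(generation);
            log.add(ok ? "commit-ok" : "commit-failed");
            return ok;
        }

        // One trip around the loop; returns whether the commit succeeded.
        boolean runOnce() {
            poll();
            process();
            return maybeCommit();
        }
    }

    public static void main(String[] args) {
        ToyGroup group = new ToyGroup();
        ToyThread healthy = new ToyThread(group, true);
        ToyThread stuck = new ToyThread(group, false);

        System.out.println("before leave, healthy commit ok: " + healthy.runOnce());
        group.leave(); // another thread shuts down and leaves the group
        System.out.println("after leave, healthy commit ok: " + healthy.runOnce());
        System.out.println("after leave, stuck commit ok: " + stuck.runOnce());
    }
}
```

      In this model the `stuck` thread keeps polling but never picks up the new generation, so its next commit fails, which matches the symptom reported in the ticket without explaining its cause.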

      A theory was that the mock time was not advancing enough to trigger a rebalance in the group coordinator. However, the consumer is closed, so that should trigger a `LeaveGroup` request and it's unclear why a rebalance is not triggered for the live consumer.
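
      The mock-time theory can be illustrated with a minimal sketch. `MockClock` and `Deadline` are hypothetical classes written for this example, not Kafka's own test utilities, and the 30-second timeout is an arbitrary assumed value. The point is only that a timeout evaluated against an injected clock never fires unless something advances that clock.

```java
public class MockTimeSketch {

    /** Hypothetical clock that moves only when the test advances it explicitly. */
    static class MockClock {
        private long nowMs = 0;

        long milliseconds() {
            return nowMs;
        }

        void advance(long ms) {
            nowMs += ms;
        }
    }

    /** Hypothetical timer whose expiry is evaluated against the injected clock. */
    static class Deadline {
        private final MockClock clock;
        private final long expiresAtMs;

        Deadline(MockClock clock, long timeoutMs) {
            this.clock = clock;
            this.expiresAtMs = clock.milliseconds() + timeoutMs;
        }

        boolean expired() {
            return clock.milliseconds() >= expiresAtMs;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        MockClock clock = new MockClock();
        Deadline sessionTimeout = new Deadline(clock, 30_000); // assumed timeout value

        Thread.sleep(50); // wall-clock time passes, mock time does not
        System.out.println("expired without advancing: " + sessionTimeout.expired());

        clock.advance(30_000); // only an explicit advance moves the mock clock
        System.out.println("expired after advancing: " + sessionTimeout.expired());
    }
}
```

      As the description points out, a timer that never expires would not by itself account for the failure, because an explicit `LeaveGroup` request should start a rebalance without waiting on any timeout.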

      PR where this issue was first seen and discussed: https://github.com/apache/kafka/pull/2095

      [1] https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/KafkaServer.scala#L222

      Attachments

        1. test.zip
          90 kB
          Amit Daga

          People

            Assignee: Unassigned
            Reporter: Ismael Juma (ijuma)
            Votes: 0
            Watchers: 5
