Details
Type: Bug
Status: In Progress
Priority: Critical
Resolution: Unresolved
Version: 3.8.0
Description
Stress testing shows that the new consumer causes fetch sessions on the broker to be evicted much more frequently than expected. The behavior is still under investigation, but it appears to stem from a race condition in the design of the new consumer.
The background thread sends fetch requests almost continuously for partitions that are "fetchable." A timing bug appears to make some partitions briefly "unfetchable," which places them in the "removed" set of partitions on the next fetch request. The broker then removes them from the fetch session, and the number of partitions remaining in that session drops below the threshold at which a competing session is allowed to evict it. Within a few milliseconds the partitions become "fetchable" again and are placed in the "added" set on the following fetch request. The result is thrashing on both the client and the broker, since both handle a steady stream of evictions, and consumption throughput suffers.
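The flapping described above can be illustrated with a small standalone sketch. This is not Kafka's actual fetch session code: the class `FetchSessionSketch`, its `nextRequest` method, and the partition names are hypothetical and exist only to show how a partition that is briefly unfetchable lands in the "removed" set on one request and in the "added" set on the next. Broker-side eviction is not modeled.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical illustration only; names do not correspond to Kafka classes.
class FetchSessionSketch {

    // Partitions the client believes are part of the broker-side session.
    private final Set<String> sessionPartitions = new LinkedHashSet<>();

    // Difference between the fetchable set and the current session contents.
    static final class Delta {
        final Set<String> added;
        final Set<String> removed;

        Delta(Set<String> added, Set<String> removed) {
            this.added = added;
            this.removed = removed;
        }

        @Override
        public String toString() {
            return "added=" + added + ", removed=" + removed;
        }
    }

    // Builds the delta for the next fetch request and updates the session view.
    Delta nextRequest(Set<String> fetchable) {
        // Fetchable now but not yet in the session: must be added.
        Set<String> added = new LinkedHashSet<>(fetchable);
        added.removeAll(sessionPartitions);

        // In the session but not fetchable now: will be removed.
        Set<String> removed = new LinkedHashSet<>(sessionPartitions);
        removed.removeAll(fetchable);

        sessionPartitions.clear();
        sessionPartitions.addAll(fetchable);
        return new Delta(added, removed);
    }

    // Number of partitions currently tracked in the session view.
    int sessionSize() {
        return sessionPartitions.size();
    }

    public static void main(String[] args) {
        FetchSessionSketch session = new FetchSessionSketch();

        // Request 1: both partitions are fetchable.
        System.out.println(session.nextRequest(Set.of("topic-0", "topic-1")));

        // Request 2: topic-1 is momentarily unfetchable, so it is removed
        // and the session shrinks.
        System.out.println(session.nextRequest(Set.of("topic-0")));

        // Request 3: topic-1 is fetchable again and is re-added.
        System.out.println(session.nextRequest(Set.of("topic-0", "topic-1")));
    }
}
```

In this sketch the session shrinks to a single partition after the second request, which corresponds to the window in which, per the description, the broker may allow a competing session to evict it.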
Issue Links
- relates to KAFKA-17439 Make polling for new records an explicit action/event in the new consumer (Patch Available)