  1. Kafka
  2. KAFKA-2978

Topic partition is sometimes not consumed after rebalancing of consumer group



    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Components: consumer, core
    • None
    • Important


      Hi there, we are evaluating Kafka 0.9 to find out whether it is stable enough and ready for production. We wrote a tool that verifies that each produced message is also properly consumed. We found the issue described below while stress-testing Kafka with this tool.
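      The kind of check such a tool performs could be sketched roughly as follows. This is not the reporter's tool and it does not talk to Kafka at all: the class `ConsumptionVerifier`, its method names, and the `stall_seconds` threshold are hypothetical, chosen only to illustrate tracking produced messages per partition and flagging partitions whose messages stay unconsumed for too long.

```python
import time
from collections import defaultdict


class ConsumptionVerifier:
    """Hypothetical helper: remembers produced message IDs per partition
    and reports partitions whose messages have not been consumed in time."""

    def __init__(self, stall_seconds=60.0, clock=time.monotonic):
        self.stall_seconds = stall_seconds
        self.clock = clock  # injectable so the check can be exercised without waiting
        # partition -> {message_id: time at which the message was produced}
        self.pending = defaultdict(dict)

    def on_produced(self, partition, message_id):
        """Record that a message was produced to the given partition."""
        self.pending[partition][message_id] = self.clock()

    def on_consumed(self, partition, message_id):
        """Mark a message as consumed; unknown or duplicate IDs are ignored."""
        self.pending[partition].pop(message_id, None)

    def stalled_partitions(self):
        """Return the sorted partitions that have at least one message
        pending for stall_seconds or longer."""
        now = self.clock()
        return sorted(
            partition
            for partition, messages in self.pending.items()
            if any(now - produced_at >= self.stall_seconds
                   for produced_at in messages.values())
        )
```

      In a real setup the producer side would call `on_produced` after a successful send and the consumer side would call `on_consumed` for every record it receives. A partition that keeps showing up in `stalled_partitions()` would match the symptom described in this report.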

      Adding more and more consumers to a consumer group may result in an unsuccessful rebalance. Data from one or more partitions is then not consumed and is effectively unavailable to the client application (e.g. for 15 minutes). The situation can be resolved externally by touching the consumer group again (adding or removing a consumer), which forces another rebalance that may or may not be successful.

      Significantly higher CPU utilization was observed in such cases (rising from about 3% to 17%). The increase occurs in both the affected consumer and the Kafka broker, according to htop and profiling with jvisualvm.

      Jvisualvm indicates the issue may be related to KAFKA-2936 (see its screenshots in the GitHub repo below), but I'm very unsure. I also don't know whether the issue is in the consumer or the broker, because both are affected and I don't know Kafka internals.

      The issue is not deterministic, but it can be easily reproduced within a few minutes just by starting more and more consumers. More parallelism with multiple CPUs probably gives the issue more chances to appear.

      The tool itself together with very detailed instructions for quite reliable reproduction of the issue and initial analysis are available here:

      My colleague was able to independently reproduce the issue by following the instructions above. If you have any questions or need any help with the tool, just let us know. This issue is a blocker for us.





              hachikuji Jason Gustafson
              turek@avast.com Michal Turek
              Guozhang Wang Guozhang Wang