[KAFKA-16430] The group-metadata-manager thread is always in a loading state and occupies one CPU, unable to end. - ASF JIRA

Details

Type: Bug
Status: Open
Priority: Blocker
Resolution: Unresolved
Affects Version/s: 2.4.0
Fix Version/s: None
Component/s: group-coordinator
Labels: None
Flags: Important

Description

I deployed three broker instances and suddenly found that the client could not consume data from certain topic partitions. I logged in to the broker acting as coordinator for the group and ran the following command to describe the consumer group:

./bin/kafka-consumer-groups.sh --bootstrap-server localhost:9093 --describe --group mygroup

The command failed with the following error:

Error: Executing consumer group command failed due to org.apache.kafka.common.errors.CoordinatorLoadInProgressException: The coordinator is loading and hence can't process requests.

I then discovered that the broker appeared to be stuck in a loop and never left the loading state. At the same time, the top command showed the "group-metadata-manager-0" thread constantly consuming 100% of one CPU. The loop never terminated, so clients could not consume topic partition data through that node. At this point, I suspected that the issue was related to the __consumer_offsets partition data file being loaded by this thread.

Finally, after I restarted the broker instance, everything went back to normal. This is strange: if the __consumer_offsets partition data file were damaged, I would expect the broker to fail on startup as well. Why did it recover automatically after a restart? And why did loading the __consumer_offsets partition data get stuck in a continuous loop in the first place?
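
To pin down what the spinning thread is doing, it helps to match the thread ID reported by top against a thread dump of the broker JVM. The following is only a sketch, assuming a Linux host and a JDK with jstack on the PATH; the helper name tid_to_nid and the BROKER_PID and TID variables are placeholders for illustration, not values taken from this incident.

```shell
#!/bin/sh
# Hypothetical helper: convert a decimal thread ID (as shown by `top -H`)
# into the hexadecimal "nid=0x..." form that jstack prints for each thread.
tid_to_nid() {
    printf 'nid=0x%x\n' "$1"
}

# Manual usage on the affected broker (BROKER_PID and TID are placeholders):
#   top -H -p "$BROKER_PID"               # note the TID of the thread at 100% CPU
#   jstack "$BROKER_PID" > threads.txt    # capture a thread dump
#   grep -A 20 "$(tid_to_nid "$TID")" threads.txt
```

Taking two or three dumps a few seconds apart and comparing the stack of the "group-metadata-manager-0" thread should show whether it stays inside the same loading code path.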

We encountered this issue in our production environment using Kafka versions 2.2.1 and 2.4.0, and I believe it may also affect other versions.

People

Assignee: Unassigned

Reporter: Gao Fei

Votes: 0

Watchers: 2

Dates

Created: 27/Mar/24 07:00

Updated: 03/Apr/24 10:51