Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
Description
When the metadata manager loads the groups and offsets of a partition of the `__consumer_offsets` topic, `GroupMetadataManager.doLoadGroupsAndOffsets` could loop forever if the start offset of the partition is smaller than the end offset and no records are actually read from the partition.
While the conditions leading to this issue are not clear, I have seen a case in a cluster where a partition had two segments that were both empty. This could theoretically happen when all the tombstones in the first segment have expired and the second segment is truncated, or when the partition is accidentally corrupted.
As a side effect, `doLoadGroupsAndOffsets` spins forever and blocks the single thread of the scheduler. This in turn blocks the loading of all the groups and offsets queued after it, as well as the expiration of offsets.
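The failure mode described above can be illustrated with a small, simplified model. This is only a sketch in Python, not the Kafka code: `FakeLog`, `load_partition`, `load_partition_guarded` and the `max_iterations` guard are hypothetical names introduced for illustration. It assumes the loader advances its current offset only when a read returns records, so an empty read with start offset < end offset never makes progress. Stopping as soon as a read returns nothing is one possible guard, shown here as an assumption rather than as the actual fix.

```python
from dataclasses import dataclass, field


@dataclass
class FakeLog:
    """Hypothetical stand-in for a partition log: maps offset -> record."""
    start_offset: int
    end_offset: int
    records: dict = field(default_factory=dict)

    def read(self, offset, max_records=10):
        """Return up to max_records (offset, record) pairs at or after offset."""
        offsets = sorted(o for o in self.records if o >= offset)
        return [(o, self.records[o]) for o in offsets[:max_records]]


def load_partition(log, max_iterations=1000):
    """Naive loop: the offset only advances when a read returns records."""
    curr_offset = log.start_offset
    loaded = []
    iterations = 0
    while curr_offset < log.end_offset:
        iterations += 1
        if iterations > max_iterations:
            # Demo-only guard so the sketch terminates; without it the
            # loop would spin forever on an empty read.
            raise RuntimeError("no progress: loop would spin forever")
        for offset, record in log.read(curr_offset):
            loaded.append(record)
            curr_offset = offset + 1
    return loaded


def load_partition_guarded(log):
    """Same loop, but it stops once a read returns no records."""
    curr_offset = log.start_offset
    loaded = []
    read_at_least_one = True
    while curr_offset < log.end_offset and read_at_least_one:
        batch = log.read(curr_offset)
        read_at_least_one = len(batch) > 0
        for offset, record in batch:
            loaded.append(record)
            curr_offset = offset + 1
    return loaded
```

With `FakeLog(start_offset=0, end_offset=5)` and no records, `load_partition` never advances and hits the demo guard, while `load_partition_guarded` returns an empty list after a single read. On a log that does contain records, both functions load the same data.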