Description
Caught this exception in the wild:
java.util.NoSuchElementException: key not found: consumer-group-38981ebe-4361-44e7-b710-7d11f5d35639 at scala.collection.MapLike.default(MapLike.scala:235) at scala.collection.MapLike.default$(MapLike.scala:234) at scala.collection.AbstractMap.default(Map.scala:63) at scala.collection.mutable.HashMap.apply(HashMap.scala:69) at kafka.coordinator.group.GroupMetadata.get(GroupMetadata.scala:214) at kafka.coordinator.group.GroupCoordinator.$anonfun$tryCompleteHeartbeat$1(GroupCoordinator.scala:1008) at scala.runtime.java8.JFunction0$mcZ$sp.apply(JFunction0$mcZ$sp.java:23) at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:253) at kafka.coordinator.group.GroupMetadata.inLock(GroupMetadata.scala:209) at kafka.coordinator.group.GroupCoordinator.tryCompleteHeartbeat(GroupCoordinator.scala:1001) at kafka.coordinator.group.DelayedHeartbeat.tryComplete(DelayedHeartbeat.scala:34) at kafka.server.DelayedOperation.maybeTryComplete(DelayedOperation.scala:122) at kafka.server.DelayedOperationPurgatory$Watchers.tryCompleteWatched(DelayedOperation.scala:391) at kafka.server.DelayedOperationPurgatory.checkAndComplete(DelayedOperation.scala:295) at kafka.coordinator.group.GroupCoordinator.completeAndScheduleNextExpiration(GroupCoordinator.scala:802) at kafka.coordinator.group.GroupCoordinator.completeAndScheduleNextHeartbeatExpiration(GroupCoordinator.scala:795) at kafka.coordinator.group.GroupCoordinator.$anonfun$handleHeartbeat$2(GroupCoordinator.scala:543) at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:253) at kafka.coordinator.group.GroupMetadata.inLock(GroupMetadata.scala:209) at kafka.coordinator.group.GroupCoordinator.handleHeartbeat(GroupCoordinator.scala:516) at kafka.server.KafkaApis.handleHeartbeatRequest(KafkaApis.scala:1617) at kafka.server.KafkaApis.handle(KafkaApis.scala:155)
Looking at the logs, I see a coordinator change just prior to this exception. The group was first unloaded as the coordinator moved to another broker and then was loaded again as the coordinator was moved back. I am guessing that somehow the delayed heartbeat is retaining the reference to the old GroupMetadata instance. Not sure exactly how this can happen though.
Attachments
Issue Links
- links to