Description
We noticed when rolling a KRaft cluster that it took an unexpectedly long time for one of the brokers to shut down. In the logs, we saw the following:
Oct 11, 2022 @ 17:53:38.277 [Controller 1] The request from broker 8 to shut down can not yet be granted because the lowest active offset 2283357 is not greater than the broker's shutdown offset 2283358. org.apache.kafka.controller.BrokerHeartbeatManager DEBUG
Oct 11, 2022 @ 17:53:38.277 [Controller 1] Updated the controlled shutdown offset for broker 8 to 2283362. org.apache.kafka.controller.BrokerHeartbeatManager DEBUG
Oct 11, 2022 @ 17:53:40.278 [Controller 1] Updated the controlled shutdown offset for broker 8 to 2283366. org.apache.kafka.controller.BrokerHeartbeatManager DEBUG
Oct 11, 2022 @ 17:53:40.278 [Controller 1] The request from broker 8 to shut down can not yet be granted because the lowest active offset 2283361 is not greater than the broker's shutdown offset 2283362. org.apache.kafka.controller.BrokerHeartbeatManager DEBUG
Oct 11, 2022 @ 17:53:42.279 [Controller 1] The request from broker 8 to shut down can not yet be granted because the lowest active offset 2283365 is not greater than the broker's shutdown offset 2283366. org.apache.kafka.controller.BrokerHeartbeatManager DEBUG
Oct 11, 2022 @ 17:53:42.279 [Controller 1] Updated the controlled shutdown offset for broker 8 to 2283370. org.apache.kafka.controller.BrokerHeartbeatManager DEBUG
Oct 11, 2022 @ 17:53:44.280 [Controller 1] The request from broker 8 to shut down can not yet be granted because the lowest active offset 2283369 is not greater than the broker's shutdown offset 2283370. org.apache.kafka.controller.BrokerHeartbeatManager DEBUG
Oct 11, 2022 @ 17:53:44.281 [Controller 1] Updated the controlled shutdown offset for broker 8 to 2283374. org.apache.kafka.controller.BrokerHeartbeatManager DEBUG
From what I can tell, the controller waits until all other brokers have caught up to the controlledShutdownOffset of the broker that is shutting down before allowing it to proceed. Presumably the intent is to make sure they all have the latest leader and ISR state.
The problem is that the controlledShutdownOffset seems to be updated after every heartbeat that the controller receives from the shutting-down broker: https://github.com/apache/kafka/blob/trunk/metadata/src/main/java/org/apache/kafka/controller/QuorumController.java#L1996. Unless all other brokers can catch up to that offset before the next heartbeat from the shutting-down broker arrives, the broker remains in the shutting-down state indefinitely.
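To make the failure mode concrete, here is a minimal, self-contained sketch of the behaviour described above. This is not the actual QuorumController/BrokerHeartbeatManager code; the class, method, and field names are invented for illustration:

{code:java}
// Hypothetical illustration only; not the real Kafka controller code.
final class ControlledShutdownSketch {
    // -1 means the broker has not started controlled shutdown yet.
    private long controlledShutdownOffset = -1;

    /**
     * Invoked for each heartbeat from the broker that wants to shut down.
     *
     * @param newestMetadataOffset     current end offset of the metadata log
     * @param lowestActiveBrokerOffset lowest offset any other active broker has replayed
     * @return true if controlled shutdown can be granted
     */
    boolean onShutdownHeartbeat(long newestMetadataOffset, long lowestActiveBrokerOffset) {
        // Behaviour described in this issue: the shutdown offset is moved
        // forward on every heartbeat, so the other brokers chase a moving target.
        controlledShutdownOffset = newestMetadataOffset;

        // Shutdown is only granted once every other active broker has replayed
        // past the shutdown offset.
        return lowestActiveBrokerOffset > controlledShutdownOffset;
    }
}
{code}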
In this case, it took more than 40 minutes before the broker completed shutdown:
Oct 11, 2022 @ 18:36:36.105 [Controller 1] The request from broker 8 to shut down has been granted since the lowest active offset 2288510 is now greater than the broker's controlled shutdown offset 2288510. org.apache.kafka.controller.BrokerHeartbeatManager INFO
Oct 11, 2022 @ 18:40:35.197 [Controller 1] The request from broker 8 to unfence has been granted because it has caught up with the offset of it's register broker record 2288906. org.apache.kafka.controller.BrokerHeartbeatManager INFO
It seems like the bug here is that we keep updating controlledShutdownOffset even after it has already been set; once it is set, we should leave it alone.
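As a rough sketch of that suggestion (again using the invented names from the example above, not the real code), the update could be guarded so the offset is only set the first time:

{code:java}
// Hypothetical sketch of the suggested fix; same invented names as above.
final class ControlledShutdownFixedSketch {
    private long controlledShutdownOffset = -1;

    boolean onShutdownHeartbeat(long newestMetadataOffset, long lowestActiveBrokerOffset) {
        // Only set the shutdown offset the first time the broker asks to shut
        // down, so the other brokers have a fixed target to catch up to.
        if (controlledShutdownOffset < 0) {
            controlledShutdownOffset = newestMetadataOffset;
        }
        return lowestActiveBrokerOffset > controlledShutdownOffset;
    }
}
{code}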