Currently, if a broker is bounced without controlled shutdown while several clients are talking to the Kafka cluster, each client notices that the leaders for some partitions are unavailable and sends metadata requests to the Kafka brokers, so the brokers receive a burst of metadata requests. Since metadata requests are slow to serve, all the I/O threads quickly become busy serving them. The request queue then fills up, which stalls the handling of finished responses, since the same network thread handles requests as well as responses. In this situation, clients time out on their metadata requests and send more metadata requests, which quickly makes the Kafka cluster unavailable.
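The feedback loop described above can be illustrated with a toy simulation. This is only a sketch and not Kafka code: the function `simulate`, its parameters (`num_clients`, `served_per_tick`, `timeout_ticks`, `max_ticks`) and all the numbers are hypothetical, and the model assumes a single FIFO request queue in which a request that the client has already given up on still costs the broker a full unit of work.

```python
from collections import deque


def simulate(num_clients, served_per_tick, timeout_ticks, max_ticks):
    """Toy model of clients re-sending metadata requests after a timeout.

    All names and numbers are illustrative assumptions, not Kafka internals.
    """
    # Each queue entry is (client_id, attempt_number); every client starts
    # with one outstanding metadata request.
    queue = deque((c, 0) for c in range(num_clients))
    attempt = [0] * num_clients   # latest attempt number per client
    sent_at = [0] * num_clients   # tick at which the latest attempt was sent
    done = [False] * num_clients  # whether the client got a usable response
    wasted = 0                    # requests served after the client gave up

    for tick in range(max_ticks):
        # Broker side: serve a bounded number of requests per tick, in FIFO order.
        for _ in range(min(served_per_tick, len(queue))):
            client, n = queue.popleft()
            if done[client] or n != attempt[client]:
                wasted += 1       # stale request: the client already timed out
            else:
                done[client] = True

        # Client side: anyone still waiting past the timeout sends a new request.
        # The old request stays in the queue because the broker cannot know
        # that the client has abandoned it.
        for client in range(num_clients):
            if not done[client] and tick + 1 - sent_at[client] >= timeout_ticks:
                attempt[client] += 1
                sent_at[client] = tick + 1
                queue.append((client, attempt[client]))

        if all(done):
            return {"ticks": tick + 1, "wasted": wasted, "completed": num_clients}

    return {"ticks": max_ticks, "wasted": wasted, "completed": sum(done)}


if __name__ == "__main__":
    # Timeout longer than the time needed to drain the initial burst.
    print(simulate(num_clients=100, served_per_tick=10, timeout_ticks=20, max_ticks=100))
    # Timeout shorter than the drain time: retries keep the queue full.
    print(simulate(num_clients=100, served_per_tick=10, timeout_ticks=5, max_ticks=100))
```

In this model, when the timeout is longer than the time needed to drain the initial burst, every client is served once and no work is wasted. When the timeout is shorter, the clients that were not served in time keep re-sending, the broker spends all of its capacity on requests that have already been abandoned, and those clients never complete within the simulated window.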