In InternalTopicManager, we have a hardcoded constant MAX_TOPIC_READY_TRY that is set to 5. We shouldn't hardcode the retry count; retrying should be bounded by a timeout, not by a number of attempts.
There are cases where the code in makeReady tries to create a topic but fails because the controller is currently in transition, and we get a warning: "Could not create internal topics: Could not create topic: <topic name> due to This is not the correct controller for this cluster." The code then retries MAX_TOPIC_READY_TRY times in a tight loop, so all attempts can be used up before the controller transition completes, and it eventually fails. We should have a retry backoff (perhaps just use retry.backoff.ms) and a timeout (perhaps just use request.timeout.ms) instead of a fixed number of retries.
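
To illustrate the proposal, here is a minimal sketch of a deadline-based retry loop with a backoff between attempts. It is not the actual InternalTopicManager code: the class DeadlineRetry, the method retryUntilDeadline, and the BooleanSupplier standing in for the topic-creation attempt are all hypothetical names. In the real fix, timeoutMs and backoffMs would presumably be read from request.timeout.ms and retry.backoff.ms, as suggested above.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of Kafka: retries until success or a deadline.
public class DeadlineRetry {

    /**
     * Calls attempt repeatedly until it returns true or timeoutMs has elapsed,
     * sleeping up to backoffMs between attempts.
     * Returns true on success, false on timeout or interruption.
     */
    public static boolean retryUntilDeadline(BooleanSupplier attempt, long timeoutMs, long backoffMs) {
        // nanoTime is monotonic, so the deadline is unaffected by wall-clock changes.
        final long deadlineNs = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            if (attempt.getAsBoolean()) {
                return true;
            }
            long remainingNs = deadlineNs - System.nanoTime();
            if (remainingNs <= 0) {
                return false; // timed out; the caller decides how to report it
            }
            // Back off, but never sleep past the deadline.
            long sleepMs = Math.min(backoffMs, Math.max(1L, remainingNs / 1_000_000L));
            try {
                Thread.sleep(sleepMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt status
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in for topic creation: fails twice (as if the controller were
        // moving), then succeeds on the third attempt.
        int[] calls = {0};
        boolean ready = retryUntilDeadline(() -> ++calls[0] >= 3, 1000L, 10L);
        System.out.println("ready=" + ready + " attempts=" + calls[0]);
    }
}
```

With this shape, a brief controller transition costs a few backoff intervals rather than exhausting a fixed attempt budget, and the total wait is capped by the timeout.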