Kafka / KAFKA-15823

NodeToControllerChannelManager: authentication error prevents controller update


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.6.0, 3.5.1
    • Fix Version/s: 3.8.0
    • Component/s: core
    • Labels: None

    Description

      NodeToControllerChannelManager caches the activeController address in an AtomicReference which is updated when:

      1. `activeController` has not been set,
      2. `networkClient` disconnects from the controller,
      3. a node replies with `Errors.NOT_CONTROLLER`, and
      4. a controller changes from ZK mode to KRaft mode.
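The caching behaviour listed above can be modelled roughly as follows. This is a minimal sketch, not Kafka's actual code: `ControllerAddressCache`, `Reason`, `invalidate` and the lookup supplier are hypothetical names, and it assumes for simplicity that conditions 2 to 4 all clear the cached value so that the next read resolves it again.

```java
import java.util.Optional;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

/** Hypothetical stand-in for the cached controller address; not Kafka's real class. */
final class ControllerAddressCache {
    /** Conditions 2 to 4 from the description (simplified: each one clears the cache). */
    enum Reason { DISCONNECTED, NOT_CONTROLLER, ZK_TO_KRAFT_MIGRATION }

    private final AtomicReference<String> activeController = new AtomicReference<>(null);
    private final Supplier<Optional<String>> controllerLookup; // assumed discovery hook

    ControllerAddressCache(Supplier<Optional<String>> controllerLookup) {
        this.controllerLookup = controllerLookup;
    }

    /** Condition 1: look the controller up only when nothing is cached. */
    Optional<String> currentController() {
        String cached = activeController.get();
        if (cached != null) {
            return Optional.of(cached);
        }
        Optional<String> resolved = controllerLookup.get();
        resolved.ifPresent(activeController::set);
        return resolved;
    }

    /** Conditions 2 to 4: drop the cached address; the reason is kept only for logging. */
    void invalidate(Reason reason) {
        System.out.println("clearing cached controller: " + reason);
        activeController.set(null);
    }

    public static void main(String[] args) {
        ControllerAddressCache cache =
            new ControllerAddressCache(() -> Optional.of("controller-a:9093"));
        System.out.println(cache.currentController());
        cache.invalidate(Reason.NOT_CONTROLLER);
        System.out.println(cache.currentController());
    }
}
```

Note that nothing in this model reacts to an authentication failure, so a cached address survives one.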

       

      When running multiple Kafka clusters in a dynamic environment, a controller's IP address may be reassigned to another cluster's broker when the controller is bounced. In this scenario, requests from the node to the controller may fail with an `AuthenticationException` and are then retried indefinitely against the same cached address. Because an authentication failure is not one of the update conditions above, the new controller's information is never set and the node gets stuck.

       

      A potential fix would be to disconnect the network client and invoke `updateControllerAddress(null)`, as we do in the `Errors.NOT_CONTROLLER` case.
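To illustrate the proposal, here is a small simulation of the retry path with and without the reset on authentication failure. It is a sketch under stated assumptions, not the change that was merged: `ControllerRequestSketch`, `Outcome`, `send`, the `resetOnAuthFailure` flag and both addresses are hypothetical, and rediscovery of the controller is reduced to setting a constant.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical model of the retry path; names do not mirror Kafka's real API. */
final class ControllerRequestSketch {
    enum Outcome { OK, NOT_CONTROLLER, AUTHENTICATION_FAILED }

    static final String STALE = "10.0.0.5:9093";   // made-up IP, now owned by another cluster
    static final String CURRENT = "10.0.0.9:9093"; // made-up address of the real controller

    private final AtomicReference<String> activeController = new AtomicReference<>(STALE);
    private final boolean resetOnAuthFailure;
    int disconnects = 0;

    ControllerRequestSketch(boolean resetOnAuthFailure) {
        this.resetOnAuthFailure = resetOnAuthFailure;
    }

    /** Pretend network call: any address other than CURRENT rejects our credentials. */
    private static Outcome send(String address) {
        return address.equals(CURRENT) ? Outcome.OK : Outcome.AUTHENTICATION_FAILED;
    }

    /** One pass of the request loop; returns true once the request succeeds. */
    boolean attempt() {
        if (activeController.get() == null) {
            activeController.set(CURRENT); // stands in for rediscovering the controller
        }
        Outcome outcome = send(activeController.get());
        switch (outcome) {
            case OK:
                return true;
            case NOT_CONTROLLER:
                resetController();
                return false;
            case AUTHENTICATION_FAILED:
                if (resetOnAuthFailure) {
                    resetController(); // the proposed handling
                }
                return false; // without the reset, the retry hits the same address
            default:
                throw new IllegalStateException("unexpected outcome " + outcome);
        }
    }

    private void resetController() {
        disconnects++;              // stands in for disconnecting the network client
        activeController.set(null); // analogous to updateControllerAddress(null)
    }

    /** Returns the number of attempts needed, or -1 if maxAttempts is exhausted. */
    int attemptsUntilSuccess(int maxAttempts) {
        for (int i = 1; i <= maxAttempts; i++) {
            if (attempt()) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(new ControllerRequestSketch(false).attemptsUntilSuccess(100));
        System.out.println(new ControllerRequestSketch(true).attemptsUntilSuccess(100));
    }
}
```

In this model the variant without the reset never leaves the stale address, while the variant with the reset drops it after the first failure and reaches the controller on the next attempt.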


            People

              Assignee: Unassigned
              Reporter: Gaurav Narula (gnarula)
              Votes: 0
              Watchers: 3
