Kafka / KAFKA-16005

ZooKeeper to KRaft migration rollback missing disabling controller and migration configuration on brokers

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.6.1
    • Fix Version/s: 3.7.0
    • Component/s: documentation
    • Labels: None

    Description

      I was following the latest documentation additions to try the rollback process of a ZK cluster migrating to KRaft, while it's still in dual-write mode: https://github.com/apache/kafka/pull/14160/files#diff-e4e8d893dc2a4e999c96713dd5b5857203e0756860df0e70fb0cb041aa4d347bR3786

      The first point is just about stopping the broker, deleting the __cluster_metadata folder, and restarting the broker.
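
To make the documented step concrete, here is a minimal Python sketch of the deletion part only. It assumes the broker is already stopped and that the metadata log lives in a directory whose name starts with `__cluster_metadata` directly under each configured log directory. The function name `delete_cluster_metadata` and the `log_dirs` parameter are hypothetical and not part of any Kafka tooling.

```python
import shutil
from pathlib import Path


def delete_cluster_metadata(log_dirs):
    """Remove every __cluster_metadata* directory found directly under each log dir.

    Assumption: the broker process has been stopped before this is called.
    Returns the list of paths that were removed.
    """
    removed = []
    for log_dir in log_dirs:
        # Only look at direct children; topic partition directories are left alone.
        for candidate in Path(log_dir).glob("__cluster_metadata*"):
            if candidate.is_dir():
                shutil.rmtree(candidate)
                removed.append(candidate)
    return removed
```

Called with the directories listed in the broker's log.dirs setting, this removes only the metadata log and leaves topic data in place. On its own it is not enough for the rollback, for the reasons described in the rest of this issue.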

      I think it's missing at least the following steps:

      • removing/disabling the ZooKeeper migration flag
      • removing all properties related to the controller configuration (e.g. controller.quorum.voters, controller.listener.names, ...)

      Without those steps, when the broker restarts, it re-creates the __cluster_metadata folder, because it syncs with the controllers while they are still running.
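
The missing steps could be sketched as a small filter over the broker's properties file. This is only an illustration under a few assumptions: the file uses plain `key=value` lines, the migration flag is `zookeeper.metadata.migration.enable` (my reading of "the ZooKeeper migration flag"), and the key set is not exhaustive, since other controller-related properties may also need removing. The names `ROLLBACK_KEYS` and `strip_migration_properties` are hypothetical.

```python
# Keys to drop on rollback. The two controller.* names come from the issue text;
# the migration flag name is an assumption. The set is illustrative, not exhaustive.
ROLLBACK_KEYS = frozenset({
    "zookeeper.metadata.migration.enable",
    "controller.quorum.voters",
    "controller.listener.names",
})


def strip_migration_properties(text, keys=ROLLBACK_KEYS):
    """Return the properties text with every `key=value` line whose key is in `keys` removed.

    Comments and blank lines are kept unchanged.
    """
    kept = []
    for line in text.splitlines():
        stripped = line.strip()
        # Only inspect real property lines, not comments ("#" or "!") or blanks.
        if stripped and not stripped.startswith(("#", "!")):
            key = stripped.split("=", 1)[0].strip()
            if key in keys:
                continue
        kept.append(line)
    return "\n".join(kept) + "\n"
```

The idea would be to run this over the broker's server.properties while the broker is stopped, write the result back, and only then restart, so that the broker comes back as a plain ZooKeeper-mode broker and no longer tries to reach the controller quorum.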

      Also, when the controllers stop, the broker starts raising exceptions like this:

      [2023-12-13 15:22:28,437] DEBUG [BrokerToControllerChannelManager id=0 name=quorum] Connection with localhost/127.0.0.1 (channelId=1) disconnected (org.apache.kafka.common.network.Selector)
      java.net.ConnectException: Connection refused
          at java.base/sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
          at java.base/sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:777)
          at org.apache.kafka.common.network.PlaintextTransportLayer.finishConnect(PlaintextTransportLayer.java:50)
          at org.apache.kafka.common.network.KafkaChannel.finishConnect(KafkaChannel.java:224)
          at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:526)
          at org.apache.kafka.common.network.Selector.poll(Selector.java:481)
          at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:571)
          at org.apache.kafka.server.util.InterBrokerSendThread.pollOnce(InterBrokerSendThread.java:109)
          at kafka.server.BrokerToControllerRequestThread.doWork(BrokerToControllerChannelManager.scala:421)
          at org.apache.kafka.server.util.ShutdownableThread.run(ShutdownableThread.java:130)
      [2023-12-13 15:22:28,438] INFO [BrokerToControllerChannelManager id=0 name=quorum] Node 1 disconnected. (org.apache.kafka.clients.NetworkClient)
      [2023-12-13 15:22:28,438] WARN [BrokerToControllerChannelManager id=0 name=quorum] Connection to node 1 (localhost/127.0.0.1:9093) could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)

      (In my setup, the controller runs locally on port 9093.)

            People

              Assignee: Paolo Patierno
              Reporter: Paolo Patierno