Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.7
    • Fix Version/s: None
    • Component/s: core
    • Labels:
      None

      Description

      There are two cases where correct partitioning fails for a new topic.

      Case 1: The topic already exists on the Kafka cluster. A new broker is then added to the cluster. The new broker will never host partitions for the existing topic.

      To reproduce:
      1) Create a cluster of brokers along with a ZooKeeper ensemble.
      2) Send messages for a topic to the cluster.
      3) Add a new broker to the cluster.
      4) New broker will never see the existing topic.

      Case 2: The topic does not yet exist on the Kafka cluster. A producer sends messages to the new topic. If, during the producer session, one or more partitions are not created on a broker, that broker will never host those partitions.

      To reproduce:
      1) Create a cluster of brokers along with a ZooKeeper ensemble.
      2) Send messages to a new topic.
      3) Shut down the producer before the topic is created on at least one broker.
      4) The broker that did not allocate the topic will never host the topic.

      My guess here is that when a new producer is created, it gets a list of topics and partitions based on the current state of the brokers in the cluster. Since some brokers are missing the topic, the producer will never send messages to that broker and partitions will never be created.
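
      The guess above can be illustrated with a toy model. This is only a sketch of the hypothesized behavior, not Kafka's producer code: `ToyCluster`, `candidates`, and `send` are hypothetical names, and the fallback to all brokers when no broker hosts the topic yet is an assumption made so that a new topic can be created at all.

```python
import random


class ToyCluster:
    """Toy model of the guessed behavior; not Kafka's actual code."""

    def __init__(self, broker_ids):
        # broker id -> set of topics that broker currently hosts
        self.hosted = {b: set() for b in broker_ids}

    def add_broker(self, broker_id):
        # A newly added broker starts with no topics.
        self.hosted[broker_id] = set()

    def candidates(self, topic):
        # Assumption: the producer only considers brokers that already host
        # the topic, and falls back to every broker for a brand-new topic.
        with_topic = sorted(b for b, t in self.hosted.items() if topic in t)
        return with_topic or sorted(self.hosted)

    def send(self, topic, rng):
        # The receiving broker creates the topic if it does not have it yet.
        broker = rng.choice(self.candidates(topic))
        self.hosted[broker].add(topic)
        return broker


def case_1(rng):
    """Existing topic, then a broker is added: which brokers get messages?"""
    cluster = ToyCluster([1, 2])
    cluster.hosted[1].add("events")
    cluster.hosted[2].add("events")
    cluster.add_broker(3)
    return {cluster.send("events", rng) for _ in range(1000)}


def case_2(rng):
    """New topic: which brokers get messages across repeated sends?"""
    cluster = ToyCluster([1, 2, 3])
    return {cluster.send("events", rng) for _ in range(1000)}


if __name__ == "__main__":
    rng = random.Random(0)
    print("case 1 targets:", case_1(rng))
    print("case 2 targets:", case_2(rng))
```

      In this model the added broker never appears among the case 1 targets, and in case 2 the topic stays pinned to whichever broker received the first message. Both outcomes are consistent with the symptoms described above, under the stated assumptions.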

      Workaround:
      Manually create the topic/partition directories in the Kafka log directory and restart Kafka. On startup the broker will register the topic/partitions in ZooKeeper.
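
      The manual step can be scripted. A minimal sketch, assuming the broker stores each partition in a directory named `<topic>-<partition>` directly under its log directory. `create_partition_dirs` and the example path are hypothetical, so check the layout against an existing topic's directories first. The broker still has to be restarted afterwards.

```python
from pathlib import Path


def create_partition_dirs(log_dir, topic, num_partitions):
    """Create one empty directory per partition and return the paths."""
    created = []
    for partition in range(num_partitions):
        # Assumed naming convention: <topic>-<partition>
        path = Path(log_dir) / f"{topic}-{partition}"
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created


if __name__ == "__main__":
    # Hypothetical log directory; use the one configured for the broker.
    for p in create_partition_dirs("/tmp/kafka-logs", "events", 2):
        print(p)
```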

        Activity

        Matt created issue -
        Matt made changes -
        Summary (original value): Topic is never distributed to nodes added to an existing cluster
        Summary (new value): Issues partitioning a new topic
        Description (original value):
        If you add one or more brokers to an existing cluster, existing topics will never be seen by the new brokers.

        To reproduce:
        1) Create a cluster of brokers along with a ZooKeeper ensemble.
        2) Send messages for a topic to the cluster.
        3) Add a new broker to the cluster.
        4) New broker will never see the existing topic.

        Description (new value):
        There are two cases where correct partitioning fails for a new topic.

        Case 1: Topic exists on current Kafka cluster. A newly added broker to the cluster will never be notified of the topic and will never host partitions.

        To reproduce:
        1) Create a cluster of brokers along with a ZooKeeper ensemble.
        2) Send messages for a topic to the cluster.
        3) Add a new broker to the cluster.
        4) New broker will never see the existing topic.

        Case 2: Topic does not exist on current Kafka cluster. Session in which producer sends first set of messages for new topic does not last long enough for bootstrapping. Partitions will not be allocated on all brokers.

        To reproduce:
        1) Create a cluster of brokers along with a ZooKeeper ensemble.
        2) Send a "small" number of messages to a new topic, then shut down producer.
        3) Partitions will be allocated on a random(?) set of brokers. Not all brokers will host partitions.
        4) Brokers that did not allocate partitions for topic will never see the new topic.

        For case 2, I'm not sure about the exact length of time, or the number of messages required before all brokers have partitions for the new topic.
        Matt made changes -
        Description (new value):
        There are two cases where correct partitioning fails for a new topic.

        Case 1: Topic exists on current Kafka cluster. A new broker is added to the cluster. The new broker will never host partitions for the existing topic.

        To reproduce:
        1) Create a cluster of brokers along with a ZooKeeper ensemble.
        2) Send messages for a topic to the cluster.
        3) Add a new broker to the cluster.
        4) New broker will never see the existing topic.

        Case 2: Topic does not exist on current Kafka cluster. Producer sends messages to a new topic that did not previously exist in the cluster. If, during the producer session, one or more partitions are not created on a broker, the broker will never host those partitions.

        To reproduce:
        1) Create a cluster of brokers along with a ZooKeeper ensemble.
        2) Send messages to a new topic.
        3) Shut down the producer before the topic is created on at least one broker.
        4) The broker that did not allocate the topic will never host the topic.

        My guess(!) here is that when a new producer is created, it gets a list of topics and partitions based on the current state of the brokers in the cluster. Since some brokers are missing the topic, the producer will never send messages to that broker and partitions will never be created.
        Matt made changes -
        Description (appended to the previous value):

        Workaround:
        Manually create the topic/partition directories in the Kafka log directory and restart Kafka. On startup the broker will register the topic/partitions in ZooKeeper.
        Nick Howard made changes -
        Attachment bootstrap_new_brokers.patch [ 12538463 ]
        Jay Kreps made changes -
        Status (original value): Open [ 1 ]
        Status (new value): Resolved [ 5 ]
        Resolution (new value): Fixed [ 1 ]

          People

          • Assignee:
            Unassigned
            Reporter:
            Matt
          • Votes:
            0
            Watchers:
            5
