Kafka / KAFKA-5564

Fail to create topics with error 'While recording the replica LEO, the partition [topic2,0] hasn't been created'

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.9.0.1
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

    Description

      Hello,

      Short version:
      We have seen sporadic occurrences of the following issue: topics whose leader is a specific broker are not created properly, and it is impossible to produce to them or consume from them.
      The following log message appears on the broker that is the leader of the faulty topics:

      [2017-07-05 05:22:15,564] WARN [Replica Manager on Broker 3]: While recording the replica LEO, the partition [topic2,0] hasn't been created. (kafka.server.ReplicaManager)
      

      Detailed version:
      Our setup consists of three brokers with ids 1, 2, 3. Broker 2 is the controller. We create 7 topics called topic1, topic2, topic3, topic4, topic5, topic6, topic7.

      Sporadically, some of the topics are faulty. In the particular example I describe here, the faulty topics are topic6, topic4, topic2 and topic3. The faulty topics all have the same leader, broker 3.

      If we run kafka-topics.sh --describe, we see that for the topics that do not have broker 3 as leader, broker 3 is missing from the in-sync replicas (Isr):

       bin/kafka-topics.sh --describe --zookeeper zookeeper:2181/kafka
      Topic:topic6	PartitionCount:1	ReplicationFactor:3	Configs:
      	Topic: topic6	Partition: 0	Leader: 3	Replicas: 3,1,2	Isr: 3,1,2
      Topic:topic5	PartitionCount:1	ReplicationFactor:3	Configs:retention.ms=300000
      	Topic: topic5	Partition: 0	Leader: 2	Replicas: 2,3,1	Isr: 2,1
      Topic:topic7	PartitionCount:1	ReplicationFactor:3	Configs:
      	Topic: topic7	Partition: 0	Leader: 1	Replicas: 1,3,2	Isr: 1,2
      Topic:topic4	PartitionCount:1	ReplicationFactor:3	Configs:
      	Topic: topic4	Partition: 0	Leader: 3	Replicas: 3,1,2	Isr: 3,1,2
      Topic:topic1	PartitionCount:1	ReplicationFactor:3	Configs:
      	Topic: topic1	Partition: 0	Leader: 2	Replicas: 2,1,3	Isr: 2,1
      Topic:topic2	PartitionCount:1	ReplicationFactor:3	Configs:
      	Topic: topic2	Partition: 0	Leader: 3	Replicas: 3,1,2	Isr: 3,1,2
      Topic:topic3	PartitionCount:1	ReplicationFactor:3	Configs:
      	Topic: topic3	Partition: 0	Leader: 3	Replicas: 3,1,2	Isr: 3,1,2
      

      For the faulty topics, by contrast, all replicas are reported as in sync.
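
The pattern in the describe output (which topics share a leader, and which replicas are absent from the ISR) can be extracted mechanically. The following is a minimal Python sketch, assuming the output format shown above; the function names `parse_describe`, `topics_by_leader` and `missing_from_isr` are hypothetical helpers, not part of any Kafka tooling.

```python
import re
from collections import defaultdict

# Matches partition lines such as:
#   Topic: topic5  Partition: 0  Leader: 2  Replicas: 2,3,1  Isr: 2,1
PARTITION_LINE = re.compile(
    r"Topic:\s*(?P<topic>\S+)\s+Partition:\s*(?P<partition>\d+)\s+"
    r"Leader:\s*(?P<leader>-?\d+)\s+Replicas:\s*(?P<replicas>[\d,]+)\s+"
    r"Isr:\s*(?P<isr>[\d,]*)"
)


def parse_describe(text):
    """Return one dict per partition line found in the describe output."""
    rows = []
    for line in text.splitlines():
        match = PARTITION_LINE.search(line)
        if match is None:
            continue  # per-topic summary lines have no Leader/Isr fields
        rows.append({
            "topic": match["topic"],
            "partition": int(match["partition"]),
            "leader": int(match["leader"]),
            "replicas": [int(b) for b in match["replicas"].split(",")],
            "isr": [int(b) for b in match["isr"].split(",") if b],
        })
    return rows


def topics_by_leader(rows):
    """Group topic names by the broker id reported as leader."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["leader"]].append(row["topic"])
    return dict(grouped)


def missing_from_isr(rows):
    """Map each (topic, partition) to the replicas absent from its ISR."""
    result = {}
    for row in rows:
        missing = sorted(set(row["replicas"]) - set(row["isr"]))
        if missing:
            result[(row["topic"], row["partition"])] = missing
    return result
```

Fed the output above, this kind of check would group topic6, topic4, topic2 and topic3 under leader 3 and report broker 3 as missing from the ISR of topic5, topic7 and topic1.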

      Also, the topic directories under the log.dir folder were not created on the faulty broker 3.
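
The missing directories can be checked with a small script run on the broker host. Kafka keeps each partition in a directory named `<topic>-<partition>` under the configured log directory, so the sketch below simply tests for those directories. The helper name `missing_partition_dirs` is hypothetical, and the log directory path must be taken from the broker configuration.

```python
import os


def missing_partition_dirs(log_dir, partitions):
    """Return the (topic, partition) pairs with no directory under log_dir."""
    missing = []
    for topic, partition in partitions:
        # Partition data lives in "<topic>-<partition>", e.g. "topic2-0".
        path = os.path.join(log_dir, f"{topic}-{partition}")
        if not os.path.isdir(path):
            missing.append((topic, partition))
    return missing
```

For example, `missing_partition_dirs("/path/to/kafka-logs", [("topic2", 0), ("topic5", 0)])` would list the partitions whose directories are absent; the path here is a placeholder, not the value from the attached configuration.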

      We see the following logs in broker 3, which is the leader of the faulty topics:

      [2017-07-05 05:22:15,564] WARN [Replica Manager on Broker 3]: While recording the replica LEO, the partition [topic2,0] hasn't been created. (kafka.server.ReplicaManager)
      

      The above message is logged continuously.

      We also see the following error on the other two brokers, which are the replicas:

      ERROR [ReplicaFetcherThread-0-3], Error for partition [topic3,0] to broker 3:org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This server does not host this topic-partition
      

      Again, the above message is logged continuously.
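
To see which partitions are affected on each broker, the two messages quoted above can be counted per partition. This is a rough Python sketch that matches only the exact message texts shown in this report; `count_symptoms` is a hypothetical helper name, and the log file location depends on the deployment.

```python
import re
from collections import Counter

# WARN line seen on the leader (broker 3 in this report).
LEO_WARNING = re.compile(
    r"While recording the replica LEO, the partition "
    r"\[(?P<topic>[^,\]]+),(?P<partition>\d+)\] hasn't been created"
)
# ERROR line seen on the followers' replica fetcher threads.
FETCH_ERROR = re.compile(
    r"Error for partition \[(?P<topic>[^,\]]+),(?P<partition>\d+)\] "
    r"to broker (?P<broker>\d+):.*UnknownTopicOrPartitionException"
)


def count_symptoms(lines):
    """Count both messages, keyed by partition (and target broker for fetch errors)."""
    leo_warnings = Counter()
    fetch_errors = Counter()
    for line in lines:
        match = LEO_WARNING.search(line)
        if match:
            leo_warnings[(match["topic"], int(match["partition"]))] += 1
            continue
        match = FETCH_ERROR.search(line)
        if match:
            key = (match["topic"], int(match["partition"]), int(match["broker"]))
            fetch_errors[key] += 1
    return leo_warnings, fetch_errors
```

Passing an open log file (an iterable of lines) to `count_symptoms` returns two counters, one for the leader-side warnings and one for the follower-side fetch errors.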

      The issue described above occurs immediately after the deployment of the Kafka cluster.
      A restart of the faulty broker (3 in this case) fixes the problem and the faulty topics work normally.

      I have also attached the broker configuration we use.

      Do you have any idea what might cause this issue?

      Best regards,

      Klearchos

          People

            Assignee: Unassigned
            Reporter: Klearchos Chaloulos (kihaloul)
            Votes: 1
            Watchers: 3
