  1. Kafka
  2. KAFKA-7122

Data is lost when ZooKeeper times out



    • Type: Bug
    • Status: Open
    • Priority: Blocker
    • Resolution: Unresolved
    • Affects Version/s:
    • Fix Version/s: None
    • Component/s: core, replication
    • Labels:


      A Kafka cluster loses data when the leader of a partition has its ZooKeeper connection time out.

      Sequence of events:

      1. Say broker A leads a partition, with brokers B and C as followers.
      2. A ZK node has a network issue, and it happens to be the node used by broker A. Let's say this happens at offset X.
      3. The Kafka controller immediately selects broker C as the new partition leader.
      4. Broker A does not time out from ZooKeeper for another 4 seconds. Broker A still thinks it is the leader and presumably keeps accepting producer writes.
      5. Broker A detects the ZK timeout and leaves the ISR.
      6. Broker A reconnects to ZK and rejoins the cluster as a follower for the partition.
      7. Broker A truncates its log to some offset Y such that Y > X. Broker A proceeds to catch up normally and becomes an in-sync replica again.
      8. The in-sync replicas for the partition are now in an inconsistent state:
        1. Broker C has all offsets X through Y plus everything after.
        2. Broker B has all offsets X through Y plus everything after.
        3. Broker A has offsets up to X and after Y. Everything between X and Y IS MISSING.
      9. Within 5 minutes, the controller triggers a preferred replica election, making broker A the new leader for the partition (this is the default behavior).

      After step 9, no consumer receives the messages at offsets between X and Y.
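The sequence above can be sketched as a toy simulation. This is only an illustration under stated assumptions, not Kafka code: logs are plain Python lists indexed by offset, the values of X, Y and END are made up, and broker A's stale writes are assumed to occupy the same offsets (X up to Y) as the records broker C accepted as the new leader. The function names are hypothetical.

```python
def simulate(x, y, end, truncate_to):
    """Return (log_a, log_c) after broker A rejoins as a follower.

    Hypothetical model: each log is a list whose index is the offset.
    """
    # Records written before the ZooKeeper problem are identical everywhere.
    shared = [f"shared-{i}" for i in range(x)]
    # Broker C becomes leader at offset x and accepts everything from there on.
    log_c = shared + [f"C-{i}" for i in range(x, end)]
    # Broker A still believes it leads and appends its own records up to y.
    log_a = shared + [f"A-{i}" for i in range(x, y)]
    # On rejoining, A truncates its log and fetches the remainder from C.
    log_a = log_a[:truncate_to]
    log_a += log_c[len(log_a):]
    return log_a, log_c


def divergent_offsets(log_a, log_c):
    """Offsets at which broker A does not hold the record broker C holds."""
    return [i for i, (a, c) in enumerate(zip(log_a, log_c)) if a != c]


X, Y, END = 100, 105, 110  # made-up offsets for illustration
log_a, log_c = simulate(X, Y, END, truncate_to=Y)
lost = divergent_offsets(log_a, log_c)
print(lost)  # [100, 101, 102, 103, 104]
```

In this model, the records broker C accepted between X and Y never reach broker A, so they become unreadable once A is elected leader in step 9.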


      The root problem seems to be that broker A truncates only to offset Y when it rejoins the cluster. It should truncate further back, to offset X, to prevent data loss.
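One way to picture "truncating back to X" is to cut the rejoining follower's log at the first offset where it disagrees with the new leader's log. The sketch below is a simplified illustration, not how a broker is implemented: it compares whole in-memory lists, which a real broker could not do, and the record layout `(epoch, payload)` and the function names `truncation_offset` and `rejoin` are assumptions made for this example.

```python
def truncation_offset(follower_log, leader_log):
    """Return the first offset at which the two logs disagree."""
    for offset, (mine, theirs) in enumerate(zip(follower_log, leader_log)):
        if mine != theirs:
            return offset
    # No disagreement in the common prefix: keep everything both logs share.
    return min(len(follower_log), len(leader_log))


def rejoin(follower_log, leader_log):
    """Truncate the follower at the divergence point, then copy from the leader."""
    keep = truncation_offset(follower_log, leader_log)
    return follower_log[:keep] + leader_log[keep:]


# Hypothetical records: epoch 1 = broker A leading, epoch 2 = broker C leading.
shared = [(1, f"m{i}") for i in range(3)]                # offsets 0..2, so X = 3
leader_c = shared + [(2, f"c{i}") for i in range(3, 8)]  # C's log as new leader
stale_a = shared + [(1, f"a{i}") for i in range(3, 5)]   # A's stale writes, Y = 5

cut = truncation_offset(stale_a, leader_c)
repaired_a = rejoin(stale_a, leader_c)
```

With these made-up logs, the cut lands at offset 3 (X) rather than 5 (Y), and the repaired log for broker A matches broker C's log exactly.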





            • Assignee:
              Nick Lipple
            • Votes:
              0
            • Watchers:
              6


              • Created: