Uploaded image for project: 'Kafka'
  1. Kafka
  2. KAFKA-1561

Data Loss for Incremented Replica Factor and Leader Election


    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s:
    • Component/s: None
    • Labels:


      This is reported on the mailing list (thanks to Jad).


      I have a test that continuously sends messages to one broker, brings up
      another broker, and adds it as a replica for all partitions, with it being
      the preferred replica for some. I have auto.leader.rebalance.enable=true,
      so replica election gets triggered. Data is being pumped to the old broker
      all the while. It seems that some data gets lost while switching over to
      the new leader. Is this a bug, or do I have something misconfigured? I also
      have request.required.acks=-1 on the producer.

      Here's what I think is happening:

      1. Producer writes message to broker 0, [EventServiceUpsertTopic,13], w/
      broker 0 currently leader, with ISR=(0), so write returns successfully,
      even when acks = -1. Correlation id 35836

      Producer log:

      [2014-07-24 14:44:26,991] [DEBUG] [dw-97 - PATCH
      [kafka.producer.BrokerPartitionInfo] Partition
      [EventServiceUpsertTopic,13] has leader 0

      [2014-07-24 14:44:26,993] [DEBUG] [dw-97 - PATCH
      [k.producer.async.DefaultEventHandler] Producer sent messages with
      correlation id 35836 for topics [EventServiceUpsertTopic,13] to broker 0 on
      2. Broker 1 is still catching up

      Broker 0 Log:

      [2014-07-24 14:44:26,992] [DEBUG] [kafka-request-handler-3]
      [kafka.cluster.Partition] Partition [EventServiceUpsertTopic,13] on broker
      0: Old hw for partition [EventServiceUpsertTopic,13] is 971. New hw is 971.
      All leo's are 975,971

      [2014-07-24 14:44:26,992] [DEBUG] [kafka-request-handler-3]
      [kafka.server.KafkaApis] [KafkaApi-0] Produce to local log in 0 ms

      [2014-07-24 14:44:26,992] [DEBUG] [kafka-processor-56821-0]
      [kafka.request.logger] Completed request:Name: ProducerRequest; Version:
      0; CorrelationId: 35836; ClientId: ; RequiredAcks: -1; AckTimeoutMs: 10000
      ms from client /
      3. Leader election is triggered by the scheduler:

      Broker 0 Log:

      [2014-07-24 14:44:26,991] [INFO ] [kafka-scheduler-0]
      [PreferredReplicaPartitionLeaderSelector]: Current leader 0 for partition [
      EventServiceUpsertTopic,13] is not the preferred replica. Trigerring
      preferred replica leader election

      [2014-07-24 14:44:26,993] [DEBUG] [kafka-scheduler-0]
      [kafka.utils.ZkUtils$] Conditional update of path
      /brokers/topics/EventServiceUpsertTopic/partitions/13/state with value

      Unknown macro: {"controller_epoch"}

      and expected version 3 succeeded, returning the new version: 4

      [2014-07-24 14:44:26,994] [DEBUG] [kafka-scheduler-0]
      [k.controller.PartitionStateMachine] [Partition state machine on
      Controller 0]: After leader election, leader cache is updated to

      [2014-07-24 14:44:26,994] [INFO ] [kafka-scheduler-0]
      [kafka.controller.KafkaController] [Controller 0]: Partition [
      EventServiceUpsertTopic,13] completed preferred replica leader election.
      New leader is 1
      4. Broker 1 is still behind, but it sets the high water mark to 971!!!

      Broker 1 Log:

      [2014-07-24 14:44:26,999] [INFO ] [kafka-request-handler-6]
      [kafka.server.ReplicaFetcherManager] [ReplicaFetcherManager on broker 1]
      Removed fetcher for partitions [EventServiceUpsertTopic,13]

      [2014-07-24 14:44:27,000] [DEBUG] [kafka-request-handler-6]
      [kafka.cluster.Partition] Partition [EventServiceUpsertTopic,13] on broker
      1: Old hw for partition [EventServiceUpsertTopic,13] is 970. New hw is -1.
      All leo's are -1,971

      [2014-07-24 14:44:27,098] [DEBUG] [kafka-request-handler-3]
      [kafka.server.KafkaApis] [KafkaApi-1] Maybe update partition HW due to
      fetch request: Name: FetchRequest; Version: 0; CorrelationId: 1; ClientId:
      ReplicaFetcherThread-0-1; ReplicaId: 0; MaxWait: 500 ms; MinBytes: 1 bytes;
      RequestInfo: [EventServiceUpsertTopic,13] ->
      PartitionFetchInfo(971,1048576), <Snipped>

      [2014-07-24 14:44:27,098] [DEBUG] [kafka-request-handler-3]
      [kafka.cluster.Partition] Partition [EventServiceUpsertTopic,13] on broker
      1: Recording follower 0 position 971 for partition [

      [2014-07-24 14:44:27,100] [DEBUG] [kafka-request-handler-3]
      [kafka.cluster.Partition] Partition [EventServiceUpsertTopic,13] on broker
      1: Highwatermark for partition [EventServiceUpsertTopic,13] updated to 971
      5. Consumer is none the wiser. All data that was in offsets 972-975 doesn't
      show up!

      I tried this with 2 initial replicas, and adding a 3rd which is supposed to
      be the leader for some new partitions, and this problem also happens there.
      The log on the old leader gets truncated to the offset on the new leader.
      What's the solution? Can I make a new broker leader for partitions that are
      currently active without losing data?



        1. producer.log
          7 kB
          Jad Naous
        2. consumer.log
          5 kB
          Jad Naous
        3. broker2.log
          27 kB
          Jad Naous
        4. broker0.log
          9 kB
          Jad Naous



            • Assignee:
              guozhang Guozhang Wang
              guozhang Guozhang Wang
            • Votes:
              0 Vote for this issue
              5 Start watching this issue


              • Created: