Affects Version/s: 1.1.1, 2.0.0
Fix Version/s: None
- When 1 of my 3 brokers is cleanly shut down, consumption and production continue as normal due to replication. (Consumers are rebalanced to the replicas, and producers are rebalanced to the remaining brokers.) However, when the cleanly-shut-down broker comes back, after about 10 minutes a flurry of production errors occurs and my consumers suddenly go back in time 2 weeks, causing a long outage (12+ hours) as all messages on some topics are replayed.
- The hypothesis is that the auto leadership rebalance happens too quickly after the downed broker returns, before it has had a chance to become fully synchronised on all partitions. In particular, it seems that having a consumer offset ahead of the most recent data on the topic that consumer was following causes the consumer to be reset to 0. (The broker settings that govern this timing are sketched below.)
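For reference, these are the broker settings that control that timing. The values shown are the stock Kafka defaults, and the "about 10 minutes" above is consistent with the 300-second imbalance check interval:
{code}
# server.properties defaults relevant to the hypothesis
auto.leader.rebalance.enable=true
leader.imbalance.check.interval.seconds=300
leader.imbalance.per.broker.percentage=10
{code}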
- Expected behaviour: bringing a node back from a clean shutdown does not cause any consumers to reset to 0.
- Actual behaviour: I experience approximately 12 hours of partial outage, triggered at the point that the auto leadership rebalance occurs after a cleanly shut down node returns.
- Workaround: disable auto leadership rebalance entirely.
- Alternatively, manually trigger a rebalance from time to time, once all nodes and all partitions are fully replicated. A sketch of both options follows.
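A minimal sketch of both options on Kafka 1.1/2.0; the property and tool names are the standard ones, while hostnames are illustrative:
{code}
# Option 1: in server.properties, disable automatic preferred-leader rebalancing
auto.leader.rebalance.enable=false

# Option 2: check that nothing is under-replicated, then trigger a
# preferred replica election manually (hostname is illustrative)
bin/kafka-topics.sh --zookeeper prod-zookeeper-1:2181 \
  --describe --under-replicated-partitions
bin/kafka-preferred-replica-election.sh --zookeeper prod-zookeeper-1:2181
{code}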
Relevant deployment details (summarised as a config sketch after this list):
- Kafka deployment with 3 brokers and 2 user topics.
- Replication factor is 3, for all topics.
- min.insync.replicas is 2, for all topics.
- ZooKeeper deployment with 3 instances.
- In the region of 10 to 15 consumers, with 2 user topics (and, of course, the system topics such as __consumer_offsets). __consumer_offsets has the standard 50 partitions. The user topics have about 3000 partitions in total.
- Offset retention time of 7 days, and topic retention time of 14 days.
- Input rate ~1000 messages/sec.
- Deployment happens to be on Google compute engine.
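For concreteness, a sketch of how that environment maps onto server.properties. The retention conversions (7 days = 10080 minutes, 14 days = 336 hours) are my own; everything else simply restates the list above:
{code}
default.replication.factor=3      # replication factor 3 for all topics
min.insync.replicas=2
offsets.topic.replication.factor=3
offsets.topic.num.partitions=50   # the standard default
offsets.retention.minutes=10080   # 7 days
log.retention.hours=336           # 14 days
{code}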
Related Stack Overflow Post:
It was suggested I open a ticket by "Muir", who says they have also experienced this.
Transcription of logs, showing the problem:
Below, you can see chronologically sorted, interleaved logs from the 3 brokers. prod-kafka-2 is the node which was cleanly shut down and then restarted. I filtered the messages to only those regarding __consumer_offsets-29, because otherwise it is too much to paste. (A sketch of the filtering command follows the table below.)
||Broker host||Broker ID||
|prod-kafka-2|1 (this one was restarted)|
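For what it's worth, a sketch of how such a filtered, interleaved transcript can be produced; the log file paths are hypothetical, and the sort relies on the bracketed log4j timestamp at the start of each line:
{code}
# grep keeps the source file name as a prefix; sort on the timestamp field
grep "__consumer_offsets-29" prod-kafka-1.log prod-kafka-2.log prod-kafka-3.log \
  | sort -t'[' -k2
{code}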
prod-kafka-2: (just starting up)
prod-kafka-3: (sees replica1 come back)
prod-kafka-3: struggling to agree with prod-kafka-2. Kicks it out of ISR, but then fights with ZooKeeper. Perhaps 2 and 3 both think they're leader?
prod-kafka-2: rudely kicks BOTH of the other two replicas out of the ISR list, even though 2 is the one we just restarted and therefore is most likely behind. (Bear in mind that it already decided to truncate the topic to 0!)
prod-kafka-3: still fighting with ZooKeeper. Eventually loses.
prod-kafka-1: suddenly gets confused and also re-inits to 0, as prod-kafka-2 apparently becomes leader.
prod-kafka-3: finally decides that prod-kafka-2 is in charge, and truncates accordingly.
prod-kafka-1: leadership inauguration confirmed.
prod-kafka-2: completes its coup of consumer offsets; all is now 0.
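For anyone reproducing this, the reset is also visible from the client side; a sketch using the standard tool, with a hypothetical group name:
{code}
# CURRENT-OFFSET jumps backwards for the affected partitions after the 'coup'
bin/kafka-consumer-groups.sh --bootstrap-server prod-kafka-1:9092 \
  --describe --group my-consumer-group
{code}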
Retrospectively, I realise that I have omitted any logs to do with leadership rebalancing. Nevertheless, as mentioned before, it is consistently reproducible, and it is also easy to work around by disabling leadership rebalance entirely.
Kafka server.properties file: