Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 2.3.0
Fix Version/s: None
Component/s: None
Description
Kafka claims it can be used as a storage system, but the following scenario shows otherwise.
- Consider a topic with a single partition and a replication factor of 2, hosted on two brokers, A and B, with A as the leader.
- Broker B fails due to disk sector errors. A sysadmin fixes the problem and brings B back up after a few minutes, but a few log segments are lost or corrupted.
- Broker B catches up on the messages it missed by fetching them from the leader, A, but does not detect the problem with its earlier log segments.
- Broker A fails, and B becomes the leader.
- A new consumer requests messages from the beginning. Broker B fails to deliver the messages that belonged to the lost or corrupted log segments.
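The scenario can be illustrated with a toy model. This is only a sketch, not Kafka code: a log is modeled as a dict from offset to message, and the helper names `catch_up` and `read_from_beginning`, the offsets, and the message contents are all hypothetical. It assumes the follower resumes fetching from its own log end offset and never re-examines older offsets.

```python
# Toy model of the failure scenario; not Kafka code.
# A "log" is a dict mapping offset -> message.

def catch_up(follower_log, leader_log, log_end_offset):
    """Copy only offsets at or beyond the follower's log end offset."""
    for offset, message in leader_log.items():
        if offset >= log_end_offset:
            follower_log[offset] = message


def read_from_beginning(log, log_end_offset):
    """Return (messages, missing_offsets) for a consumer starting at offset 0."""
    messages, missing = [], []
    for offset in range(log_end_offset):
        if offset in log:
            messages.append(log[offset])
        else:
            missing.append(offset)
    return messages, missing


# Leader A holds offsets 0..9; follower B was in sync up to offset 6.
log_a = {i: f"msg-{i}" for i in range(10)}
log_b = {i: f"msg-{i}" for i in range(6)}

# B loses its earliest segment (offsets 0..2) to disk errors while it is down.
for lost in (0, 1, 2):
    del log_b[lost]

# B restarts and resumes fetching from its log end offset (6).
catch_up(log_b, log_a, log_end_offset=6)

# A fails and B becomes leader; a new consumer reads from the beginning.
messages, missing = read_from_beginning(log_b, log_end_offset=10)
print("missing offsets:", missing)
```

In this model B looks fully caught up at the tail of the log, yet offsets 0 to 2 are gone for good once A is no longer available.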
Suggested solution
Immediately after coming up, a broker should run a sanity check on its log segments, e.g. a CRC check. Any segments found to be corrupted or lost should be refetched from the leader.
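A minimal sketch of what such a startup check could look like, again as a toy model rather than Kafka code. It assumes the broker can obtain an expected CRC for each segment (for example from a persisted manifest or from the leader); that source, the helper names `find_bad_segments` and `repair_from_leader`, and the segment contents are hypothetical. Segments are keyed by base offset and checksummed with Python's `zlib.crc32`.

```python
import zlib


def checksum(segment_bytes):
    """CRC-32 of a segment's raw bytes."""
    return zlib.crc32(segment_bytes)


def find_bad_segments(local_segments, expected_crcs):
    """Return base offsets of segments that are missing or fail the CRC check."""
    bad = []
    for base_offset, expected in sorted(expected_crcs.items()):
        data = local_segments.get(base_offset)
        if data is None or checksum(data) != expected:
            bad.append(base_offset)
    return bad


def repair_from_leader(local_segments, leader_segments, bad_offsets):
    """Replace each bad local segment with the leader's copy."""
    for base_offset in bad_offsets:
        local_segments[base_offset] = leader_segments[base_offset]


# Hypothetical segments, keyed by base offset.
leader_segments = {
    0: b"records 0-99",
    100: b"records 100-199",
    200: b"records 200-299",
}
# Assumption: expected CRCs are available from some trusted source.
expected_crcs = {base: checksum(data) for base, data in leader_segments.items()}

# The restarted broker has one corrupted segment and one lost segment.
local_segments = dict(leader_segments)
corrupted = bytearray(local_segments[0])
corrupted[0] ^= 0x01  # flip a single bit to simulate a sector error
local_segments[0] = bytes(corrupted)
del local_segments[100]

bad = find_bad_segments(local_segments, expected_crcs)
repair_from_leader(local_segments, leader_segments, bad)
remaining = find_bad_segments(local_segments, expected_crcs)
print("refetched segments:", bad, "still bad:", remaining)
```

The check only helps while the leader still holds a good copy, which is why the proposal is to run it right after the broker comes up, before any failover.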