KAFKA-977

Implement generation/term per leader to reconcile messages correctly

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.11.0.0
    • Component/s: None
    • Labels: None

      Description

      During an unclean leader election, the logs of different replicas can diverge, and when the followers come back up Kafka does not reconcile them correctly. To fix this, we need to add a term/generation to each message and use it to reconcile the logs.

        Activity

        sriramsub Sriram Subramanian added a comment -

        I would like to bring this issue up for discussion again. Kafka is now used for much more than moving data from point A to point B. For example, consider the case where Kafka acts as the log and materialized views are created by consuming it. In such scenarios, it is important that the replicas' logs stay consistent and do not diverge, even under unclean leader elections (replaying any replica should produce the same view). A generation/term is essential for log replication, and it would be great for Kafka to offer the same guarantees as other log replication protocols. I would be happy to give more detailed examples, but first I would like to know whether we think this is an issue to address soon.

        junrao Jun Rao added a comment -

        Sriram,

        There has been some discussion related to this in KAFKA-1211 as well. Yes, by using leader generations per partition, we can (1) make sure replicas are consistent after an unclean leader election, and (2) make sure there is no data loss in the corner case discussed in KAFKA-1211 (i.e., another leader failure happens just after the follower has truncated its log, but before it has re-replicated existing committed data from the leader). The change potentially requires wire protocol and on-disk format changes, though, so we need to think through how to do that in a backward-compatible way.
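To make point (1) concrete, here is a minimal sketch of how a per-leader generation could be used to reconcile a follower's log. It is an illustration only, not Kafka's implementation: the names `Entry`, `first_divergent_offset`, and `reconcile` are hypothetical, logs are plain in-memory lists whose index is the offset, and it assumes that two entries at the same offset with the same generation are identical (a generation has exactly one leader).

```python
from typing import List, Tuple

# Hypothetical log entry: (leader generation, payload). The list index is the offset.
Entry = Tuple[int, str]


def first_divergent_offset(leader: List[Entry], follower: List[Entry]) -> int:
    """Return the first offset at which the two logs disagree on generation.

    If one log is a prefix of the other, this is the length of the shorter log.
    """
    common = min(len(leader), len(follower))
    for offset in range(common):
        if leader[offset][0] != follower[offset][0]:
            return offset
    return common


def reconcile(leader: List[Entry], follower: List[Entry]) -> List[Entry]:
    """Truncate the follower at the divergence point, then copy the leader's tail."""
    keep = first_divergent_offset(leader, follower)
    return follower[:keep] + leader[keep:]


# The old leader (generation 2) wrote "c" and "d", which never reached the
# replica that was later elected uncleanly as the generation 3 leader.
leader_log = [(1, "a"), (1, "b"), (3, "x")]
follower_log = [(1, "a"), (1, "b"), (2, "c"), (2, "d")]

print(first_divergent_offset(leader_log, follower_log))
print(reconcile(leader_log, follower_log) == leader_log)
```

In this example, truncating the follower by offset alone (to the leader's end offset of 3) would keep `(2, "c")` at offset 2, where the leader holds `(3, "x")`. Comparing generations is what exposes the divergence at offset 2.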

        ijuma Ismael Juma added a comment -

        Marking this as fixed since it seems to be the same as KIP-101/KAFKA-1211.

        junrao Jun Rao added a comment -

        Just to clarify: KAFKA-1211 added the leader epoch to the message set to prevent data loss. However, it didn't address the log divergence issue caused by unclean leader election. The complexity is mostly with compacted topics. When a log is compacted, it's possible for all messages in a given leader epoch to be deleted. Therefore, it's a bit tricky to fully reconcile the log when an unclean leader election happens. This is less of an issue since unclean leader election will be turned off by default from 0.11.0.
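The compaction problem can be illustrated with a small sketch. This is a simplified model, not Kafka's log cleaner: `Record`, `compact`, and `epochs` are hypothetical names, and compaction is reduced to "keep only the latest record per key".

```python
from typing import Dict, List, NamedTuple, Set


class Record(NamedTuple):
    offset: int
    epoch: int  # leader epoch under which the record was written
    key: str


def compact(log: List[Record]) -> List[Record]:
    """Simplified compaction: keep only the latest record for each key."""
    latest: Dict[str, Record] = {}
    for record in log:
        latest[record.key] = record  # later offsets overwrite earlier ones
    return sorted(latest.values(), key=lambda r: r.offset)


def epochs(log: List[Record]) -> Set[int]:
    """Return the set of leader epochs that still have at least one record."""
    return {record.epoch for record in log}


log = [
    Record(0, 1, "k1"),
    Record(1, 2, "k1"),
    Record(2, 3, "k1"),
    Record(3, 3, "k2"),
]

print(epochs(log))
print(epochs(compact(log)))
```

After compaction only the epoch 3 records survive, so a replica can no longer find where epochs 1 and 2 began or ended by inspecting the records themselves. That is why reconciliation based purely on per-message epochs gets tricky for compacted topics.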

        ijuma Ismael Juma added a comment -

        Jun Rao, should we reopen this then? I believe there was at least one more JIRA for the log divergence issue.


          People

          • Assignee:
            sriramsub Sriram Subramanian
            Reporter:
            sriramsub Sriram Subramanian