  1. Kafka
  2. KAFKA-7414

Do not fail broker on out of range offsets in replica fetcher

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.1.2, 2.0.1, 2.1.0
    • Component/s: replication
    • Labels:
      None

      Description

      In the replica fetcher, we have logic to detect the case when the follower's offset is ahead of the leader's. If unclean leader election is not enabled, we raise a fatal error and kill the broker.
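
      The check described above can be sketched as follows. This is a minimal illustration, not the broker's actual code: the class `OutOfRangeCheck`, the `Action` enum, and the method `onOffsetOutOfRange` are hypothetical names, and the truncate branch taken when unclean leader election is enabled is an assumption.

```java
public class OutOfRangeCheck {
    /** Hypothetical outcomes of the follower-side check; not Kafka's real types. */
    public enum Action { NOT_AHEAD, TRUNCATE_TO_LEADER_END, FATAL_HALT }

    /**
     * Decide what the follower does when its fetch offset is out of range.
     * Only the "follower ahead of leader" case is relevant to this issue.
     */
    public static Action onOffsetOutOfRange(long followerEndOffset,
                                            long leaderEndOffset,
                                            boolean uncleanLeaderElectionEnabled) {
        if (followerEndOffset <= leaderEndOffset) {
            // Follower is not ahead of the leader; that case is outside this sketch.
            return Action.NOT_AHEAD;
        }
        // Follower is ahead of the leader. Without unclean leader election this
        // is treated as fatal (the behavior this issue wants to remove).
        // Assumption: with unclean election enabled, the follower truncates.
        return uncleanLeaderElectionEnabled
                ? Action.TRUNCATE_TO_LEADER_END
                : Action.FATAL_HALT;
    }

    public static void main(String[] args) {
        System.out.println(onOffsetOutOfRange(120L, 100L, false)); // FATAL_HALT
        System.out.println(onOffsetOutOfRange(120L, 100L, true));  // TRUNCATE_TO_LEADER_END
    }
}
```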

      This behavior is inconsistent across message formats. With KIP-101/KIP-279, a replica that becomes a follower uses leader epoch information to reconcile the end of its log with the leader's and simply truncates. With the old format, the check is also not a reliable way to detect data loss: the unclean leader's end offset might already have caught up to the follower's offset by the time of the follower's initial fetch, or by the time it queries for the current log end offset.

      To make the logic consistent, we could raise a fatal error whenever the follower has to truncate below the high watermark. However, a fatal error is probably overkill: once the leader has been elected, most of the damage is already done, and killing the broker has a huge blast radius. It would be better to log a warning.
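
      The proposed behavior (warn and truncate rather than halt) might look roughly like the sketch below. All names here (`TruncationPolicy`, `truncate`, `WARNINGS`) are hypothetical, and the in-memory list stands in for the broker's logger; the actual fix may be structured differently.

```java
import java.util.ArrayList;
import java.util.List;

public class TruncationPolicy {
    /** Stand-in for a logger so the sketch stays self-contained. */
    public static final List<String> WARNINGS = new ArrayList<>();

    /**
     * Truncate the follower's log to targetOffset and return the new end offset.
     * Truncating below the high watermark is only logged, never fatal.
     */
    public static long truncate(long followerEndOffset,
                                long highWatermark,
                                long targetOffset) {
        if (targetOffset < highWatermark) {
            // Possible loss of committed data: record a warning and carry on
            // instead of killing the whole broker.
            WARNINGS.add("Truncating to offset " + targetOffset
                    + " below high watermark " + highWatermark);
        }
        // Truncation can only shrink the log, never extend it.
        return Math.min(followerEndOffset, targetOffset);
    }

    public static void main(String[] args) {
        System.out.println(truncate(100L, 90L, 80L)); // 80, with one warning recorded
        System.out.println(truncate(100L, 90L, 95L)); // 95, no additional warning
        System.out.println(WARNINGS.size());          // 1
    }
}
```

      The point of the design is that a single partition's inconsistency is surfaced in the logs while every other partition hosted on the broker keeps being served.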

              People

              • Assignee: Jason Gustafson (hachikuji)
              • Reporter: Jason Gustafson (hachikuji)