[KAFKA-8001] AlterLogDirs: Fetch from future replica stalls when local replica becomes a leader - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Critical
Resolution: Duplicate
Affects Version/s: 2.1.0, 2.1.1
Fix Version/s: None
Component/s: core
Labels: None

Description

With KIP-320, a fetch from a follower or future replica returns FENCED_LEADER_EPOCH if the current leader epoch in the request is lower than the leader epoch known to the leader (or to the local replica, when a future replica is fetching). When the future replica is fetching from the local replica and the local replica becomes the leader of the partition, the next fetch from the future replica fails with FENCED_LEADER_EPOCH, and fetching from the future replica stops until the next leader change.
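The stall described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Kafka's broker code: the class names `LocalReplica` and `FutureReplicaFetcher` and their methods are hypothetical, and the model assumes that becoming leader bumps the local replica's leader epoch while the future replica keeps sending the epoch it started with.

```java
public class FencedFutureFetchSketch {
    enum FetchResult { OK, FENCED_LEADER_EPOCH }

    /** Hypothetical stand-in for the local (current) replica of the partition. */
    static final class LocalReplica {
        private int leaderEpoch;

        LocalReplica(int leaderEpoch) { this.leaderEpoch = leaderEpoch; }

        /** Epoch check in the spirit of KIP-320: a lower request epoch is fenced. */
        FetchResult handleFetch(int requestLeaderEpoch) {
            return requestLeaderEpoch < leaderEpoch
                    ? FetchResult.FENCED_LEADER_EPOCH
                    : FetchResult.OK;
        }

        /** Assumption: a leader change increases the leader epoch by one. */
        void becomeLeader() { leaderEpoch++; }
    }

    /** Hypothetical stand-in for the fetcher that fills the future replica. */
    static final class FutureReplicaFetcher {
        private final int currentLeaderEpoch; // never refreshed: this models the bug
        private boolean fetching = true;

        FutureReplicaFetcher(int currentLeaderEpoch) {
            this.currentLeaderEpoch = currentLeaderEpoch;
        }

        FetchResult fetchOnce(LocalReplica local) {
            FetchResult result = local.handleFetch(currentLeaderEpoch);
            if (result == FetchResult.FENCED_LEADER_EPOCH) {
                fetching = false; // fetching stops and nothing restarts it
            }
            return result;
        }

        boolean isFetching() { return fetching; }
    }

    public static void main(String[] args) {
        LocalReplica local = new LocalReplica(5);
        FutureReplicaFetcher fetcher = new FutureReplicaFetcher(5);
        System.out.println(fetcher.fetchOnce(local)); // epochs match
        local.becomeLeader();                         // local epoch moves ahead
        System.out.println(fetcher.fetchOnce(local)); // stale epoch in request
        System.out.println("still fetching: " + fetcher.isFetching());
    }
}
```

In this model the fetcher never recovers on its own, which matches the report that fetching resumes only after the next leader change.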

This is a further operational burden because log dir movement also disables log cleaning, so the original compacted partition that the user may want to move continues to grow without bound.

Proposed solution: on a leader change of the local replica, the future replica should "become a follower" again and go through the truncation phase. Alternatively, as an optimization, we could just update the partition state of the future replica to reflect the updated current leader epoch.
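The two proposed options can be compared in a toy state machine. This is a hedged sketch, not the change made in any of the linked pull requests: the `State` enum, the `FutureReplicaFetcher` class, and all method names are hypothetical and exist only to show how each option keeps the request epoch from going stale.

```java
public class FutureReplicaEpochRefreshSketch {
    enum State { TRUNCATING, FETCHING, STALLED }

    /** Hypothetical model of the fetcher that fills the future replica. */
    static final class FutureReplicaFetcher {
        private int currentLeaderEpoch;
        private State state = State.FETCHING;

        FutureReplicaFetcher(int currentLeaderEpoch) {
            this.currentLeaderEpoch = currentLeaderEpoch;
        }

        /** Option 1: "become a follower" again and redo the truncation phase. */
        void onLeaderChangeRetruncate(int newLeaderEpoch) {
            currentLeaderEpoch = newLeaderEpoch;
            state = State.TRUNCATING;
        }

        /** Option 2 (optimization): only refresh the epoch, keep fetching. */
        void onLeaderChangeUpdateEpoch(int newLeaderEpoch) {
            currentLeaderEpoch = newLeaderEpoch;
        }

        /** Truncation is modeled as a single step that resumes fetching. */
        void completeTruncation() {
            if (state == State.TRUNCATING) {
                state = State.FETCHING;
            }
        }

        /** Returns true if the fetch is accepted by the local replica. */
        boolean fetchOnce(int localLeaderEpoch) {
            if (state != State.FETCHING) {
                return false;
            }
            if (currentLeaderEpoch < localLeaderEpoch) {
                state = State.STALLED; // fenced: the behavior reported in this issue
                return false;
            }
            return true;
        }

        State state() { return state; }
    }

    public static void main(String[] args) {
        FutureReplicaFetcher optionOne = new FutureReplicaFetcher(5);
        optionOne.onLeaderChangeRetruncate(6);
        optionOne.completeTruncation();
        System.out.println("option 1 fetch accepted: " + optionOne.fetchOnce(6));

        FutureReplicaFetcher optionTwo = new FutureReplicaFetcher(5);
        optionTwo.onLeaderChangeUpdateEpoch(6);
        System.out.println("option 2 fetch accepted: " + optionTwo.fetchOnce(6));
    }
}
```

Option 1 costs an extra truncation pass but reuses the follower path; option 2 skips that pass and relies on the future replica's log already being consistent with the local replica.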

Issue Links

duplicates

KAFKA-9654 ReplicaAlterLogDirsThread can't be created again if the previous ReplicaAlterLogDirsThread encounters leader epoch error

Resolved

links to

GitHub Pull Request #6395

GitHub Pull Request #6839

GitHub Pull Request #6841

People

Assignee: Jason Gustafson

Reporter: Anna Povzner

Votes: 0

Watchers: 6

Dates

Created: 26/Feb/19 00:50

Updated: 16/Mar/20 16:38

Resolved: 16/Mar/20 16:38