On a cluster of 6 nodes, we recently saw a case where a single under-replicated partition suddenly appeared, replication lag spiked, and network I/O spiked. The cluster eventually appeared to recover on its own.
Looking at the logs, the failure was confined to partition 7 of the topic background_queue. Its ISR was [1,4,3] and its leader at the time was node 3. Here are the interesting log lines:
On node 3 (the leader):
Note that both replicas suddenly asked for an offset beyond the range of offsets available on the leader.
And on nodes 1 and 4 (the replicas), many occurrences of the following:
Based on my reading, it looks like the replicas somehow got ahead of the leader, asked for an invalid offset, got an out-of-range error, and re-replicated the entire topic from scratch to recover. This matches our network graphs, which show node 3 sending a large amount of data to nodes 1 and 4.
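To make sure I'm describing the behavior I suspect clearly, here is a minimal sketch (not actual Kafka source; all class and method names are made up for illustration) of a follower that, on an out-of-range fetch, throws away its local log and re-fetches everything from the leader's start offset. A recovery path like this would explain the burst of traffic from the leader to both replicas:

```python
class OffsetOutOfRange(Exception):
    """Raised when a fetch asks for an offset past the leader's log end."""


class Leader:
    def __init__(self, log):
        self.log = log  # list of records; list index == offset

    def fetch(self, offset):
        if offset > len(self.log):
            raise OffsetOutOfRange(offset)
        return self.log[offset:]

    def log_start_offset(self):
        return 0


class Follower:
    def __init__(self, log):
        self.log = log
        self.fetch_offset = len(self.log)  # next offset to request

    def replicate_from(self, leader):
        try:
            records = leader.fetch(self.fetch_offset)
        except OffsetOutOfRange:
            # The follower is "ahead" of the leader: discard the local
            # log and restart replication from the leader's start
            # offset, re-fetching the entire partition.
            self.log = []
            self.fetch_offset = leader.log_start_offset()
            records = leader.fetch(self.fetch_offset)
        self.log.extend(records)
        self.fetch_offset = len(self.log)


# The follower has somehow ended up ahead of the leader
# (next fetch offset 12, but the leader's log ends at 10):
leader = Leader(log=[f"m{i}" for i in range(10)])
follower = Follower(log=[f"m{i}" for i in range(12)])
follower.replicate_from(leader)
assert follower.log == leader.log  # full re-replication, heavy network I/O
```

Again, this is just my mental model of the recovery, inferred from the logs and the network graphs, not a claim about how the broker actually implements it.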
Taking a stab in the dark at the cause: could there be a race condition where replicas learn about a new offset before the leader has committed it and is ready to serve fetches for it?