[KAFKA-2165] ReplicaFetcherThread: data loss on unknown exception - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Duplicate
Affects Version/s: 0.8.2.1
Fix Version/s: None
Component/s: None
Labels: None
Flags: Patch

Description

Occasionally a replica in our cluster drops out of the ISR, and the broker then re-downloads the partition from the beginning. The logs contain the following messages:

# The leader:
[2015-03-25 11:11:07,796] ERROR [Replica Manager on Broker 21]: Error when processing fetch request for partition [topic,11] offset 54369274 from follower with correlation id 2634499. Possible cause: Request for offset 54369274 but we only have log segments in the range 49322124 to 54369273. (kafka.server.ReplicaManager)

# The follower:
[2015-03-25 11:11:08,816] WARN [ReplicaFetcherThread-0-21], Replica 31 for partition [topic,11] reset its fetch offset from 49322124 to current leader 21's start offset 49322124 (kafka.server.ReplicaFetcherThread)
[2015-03-25 11:11:08,816] ERROR [ReplicaFetcherThread-0-21], Current offset 54369274 for partition [topic,11] out of range; reset offset to 49322124 (kafka.server.ReplicaFetcherThread)

This occurs because we update fetchOffset first and only then try to process the message.
If any exception other than OffsetOutOfRangeCode occurs during processing, fetchOffset and replica.logEndOffset are left out of sync.
On the next fetch iteration we can then have fetchOffset > replica.logEndOffset == leaderEndOffset, and the leader answers with OffsetOutOfRangeCode.
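
The ordering problem described above can be illustrated with a small, self-contained sketch. This is not Kafka's code: `FetchOffsetSketch`, `ReplicaLog`, `Fetcher`, and both `process...` methods are hypothetical names invented for illustration, and the sketch assumes the unexpected exception is caught by the fetcher so that the thread keeps running. The second method shows one possible alternative ordering; it is not necessarily what the attached patch does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical, simplified model of a follower; none of these types are Kafka classes. */
final class FetchOffsetSketch {

    /** Stand-in for the follower's local log. logEndOffset is the next offset to write. */
    static final class ReplicaLog {
        private final List<String> messages = new ArrayList<>();
        private boolean failNextAppend;

        long logEndOffset() { return messages.size(); }

        /** Makes the next append throw, simulating an unexpected error. */
        void failNextAppend() { failNextAppend = true; }

        void append(List<String> batch) {
            if (failNextAppend) {
                failNextAppend = false;
                throw new IllegalStateException("simulated unexpected error during append");
            }
            messages.addAll(batch);
        }
    }

    /** Stand-in for the fetcher's per-partition state. */
    static final class Fetcher {
        final ReplicaLog log = new ReplicaLog();
        long fetchOffset = 0;

        /** Ordering described in the report: advance fetchOffset, then process. */
        void processOffsetFirst(List<String> batch) {
            fetchOffset += batch.size();
            try {
                log.append(batch);
            } catch (RuntimeException e) {
                // Assumption: the error is caught here, but fetchOffset has already moved.
            }
        }

        /** Alternative ordering: advance fetchOffset only after the append succeeded. */
        void processAppendFirst(List<String> batch) {
            try {
                log.append(batch);
                fetchOffset = log.logEndOffset();
            } catch (RuntimeException e) {
                // fetchOffset is unchanged, so the same batch is requested again.
            }
        }
    }

    public static void main(String[] args) {
        List<String> batch = Arrays.asList("m0", "m1", "m2");

        Fetcher offsetFirst = new Fetcher();
        offsetFirst.log.failNextAppend();
        offsetFirst.processOffsetFirst(batch);
        System.out.println("offset first: fetchOffset=" + offsetFirst.fetchOffset
                + " logEndOffset=" + offsetFirst.log.logEndOffset());

        Fetcher appendFirst = new Fetcher();
        appendFirst.log.failNextAppend();
        appendFirst.processAppendFirst(batch);
        System.out.println("append first: fetchOffset=" + appendFirst.fetchOffset
                + " logEndOffset=" + appendFirst.log.logEndOffset());
    }
}
```

In the first case the follower ends up with fetchOffset ahead of its own log end offset, so its next fetch asks for data past what it actually stored. In the second case the two values stay equal after the failure and the batch is simply fetched again.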

Attachments

KAFKA-2165.patch (2 kB, uploaded 02/May/15 20:20 by Alexey Ozeritskiy)

Issue Links

duplicates: KAFKA-2143 Replicas get ahead of leader and fail (Resolved)

is related to: KAFKA-2164 ReplicaFetcherThread: suspicious log message on reset offset (Resolved)

People

Assignee: Unassigned
Reporter: Alexey Ozeritskiy
Reviewer: Jun Rao
Votes: 3
Watchers: 11

Dates

Created: 02/May/15 20:19
Updated: 14/Mar/16 09:12
Resolved: 14/Mar/16 09:10