[KUDU-1170] Queue should reset all_replicated_opid when becoming LEADER - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Cannot Reproduce
Affects Version/s: Private Beta
Fix Version/s: 1.2.0
Component/s: consensus
Labels: None
Target Version/s: 1.2.0

Description

Looking at the logs on a busy server, I see various cases like:

Queue going to LEADER mode. State: All replicated op: 10.6, Majority replicated op: 10.5,

I'm not sure whether this is actually causing downstream problems, but it definitely seems counter-intuitive: the all-replicated watermark (10.6) is ahead of the majority-replicated watermark (10.5), even though any op replicated to all peers has necessarily been replicated to a majority. I think the issue is that in SetLeaderMode we reset majority_replicated_op based on the committed index, but we don't reset all_replicated. I think the all_replicated watermark from a previous term can get ahead of the committed index when we hit the "cannot advance committed index until we've replicated something in our own term" rule or something like it, but there may be some other race here.
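To make the suspected mechanism concrete, here is a minimal sketch of the two watermarks and the reset on becoming leader. Everything in it is an assumption for illustration: OpId, QueueState, WatermarksConsistent, SetLeaderModeMajorityOnly and SetLeaderModeClampAll are hypothetical names, not Kudu's actual queue code, and clamping all_replicated to the committed op is only one possible way to "reset" it.

```cpp
#include <cassert>
#include <cstdint>
#include <tuple>

// Hypothetical stand-in for a consensus operation id, printed as "term.index".
struct OpId {
  int64_t term;
  int64_t index;
};

// Orders op ids by term first, then by index.
inline bool operator<(const OpId& a, const OpId& b) {
  return std::tie(a.term, a.index) < std::tie(b.term, b.index);
}

// Hypothetical, simplified view of the queue watermarks from the log line.
struct QueueState {
  OpId all_replicated;       // highest op assumed to be on every peer
  OpId majority_replicated;  // highest op assumed to be on a majority
};

// An op that is on all peers is also on a majority, so all_replicated
// should never be ahead of majority_replicated.
inline bool WatermarksConsistent(const QueueState& s) {
  return !(s.majority_replicated < s.all_replicated);
}

// Behavior as described in the report: only the majority watermark is
// reset from the committed op; all_replicated keeps its old value.
inline void SetLeaderModeMajorityOnly(QueueState* s, const OpId& committed) {
  s->majority_replicated = committed;
}

// One possible fix (an assumption, not the actual patch): also pull
// all_replicated back so it is never ahead of the committed op.
inline void SetLeaderModeClampAll(QueueState* s, const OpId& committed) {
  s->majority_replicated = committed;
  if (committed < s->all_replicated) {
    s->all_replicated = committed;
  }
  assert(WatermarksConsistent(*s));  // postcondition of the clamped reset
}
```

Starting from all_replicated = 10.6 and a committed op of 10.5, the first variant leaves the queue in the state shown in the log above, while the second leaves both watermarks at 10.5.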

Issue Links

relates to: KUDU-1169 SIGILL when aborting a replaced operation from previous leader (Resolved)

People

Assignee: Todd Lipcon
Reporter: Todd Lipcon
Votes: 0
Watchers: 2

Dates

Created: 19/Sep/15 15:45
Updated: 19/Dec/16 05:33
Resolved: 19/Dec/16 05:33