Kudu / KUDU-1170

Queue should reset all_replicated_opid when becoming LEADER

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Cannot Reproduce
    • Affects Version/s: Private Beta
    • Fix Version/s: 1.2.0
    • Component/s: consensus
    • Labels:
      None
    • Target Version/s:

      Description

      Looking at the logs on a busy server, I see various cases like:

      Queue going to LEADER mode. State: All replicated op: 10.6, Majority replicated op: 10.5,
      

      I'm not sure whether it's actually causing downstream problems, but it definitely seems counter-intuitive: an op that has been replicated to all peers has necessarily been replicated to a majority, so the all-replicated watermark (10.6) should not be ahead of the majority-replicated one (10.5). I think the issue is that in SetLeaderMode, we reset majority_replicated_op based on the committed index, but we don't reset all_replicated. I think it's possible for the all_replicated watermark from a previous term to get ahead of the committed index when we hit the "cannot advance committed index until we've replicated something in our own term" rule (or something like it), but there may be some other race here.
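
To make the suspected sequence concrete, here is a toy Python model of the two watermarks. It is only a sketch of the reasoning above, not Kudu's C++ code: `OpId`, `QueueState`, and both `set_leader_mode_*` functions are hypothetical names, and clamping all_replicated to the committed index is just one possible reading of "reset".

```python
from dataclasses import dataclass
from typing import NamedTuple


class OpId(NamedTuple):
    """A (term, index) pair, printed as 'term.index' like the log line."""
    term: int
    index: int

    def __str__(self) -> str:
        return f"{self.term}.{self.index}"


@dataclass
class QueueState:
    """Toy stand-in for the queue's watermarks (hypothetical, not Kudu's class)."""
    all_replicated: OpId
    majority_replicated: OpId

    def watermarks_consistent(self) -> bool:
        # An op replicated on every peer is also replicated on a majority,
        # so all_replicated should never be ahead of majority_replicated.
        # Simplification: compare by log index only.
        return self.all_replicated.index <= self.majority_replicated.index


def set_leader_mode_as_reported(state: QueueState, committed: OpId) -> QueueState:
    """Behavior described in the report: only the majority watermark is reset."""
    return QueueState(state.all_replicated, committed)


def set_leader_mode_with_reset(state: QueueState, committed: OpId) -> QueueState:
    """Assumed fix: also pull all_replicated back to the committed index."""
    clamped = min(state.all_replicated, committed, key=lambda op: op.index)
    return QueueState(clamped, committed)


# Watermarks left over from the previous term; the committed index lags at 10.5.
before = QueueState(all_replicated=OpId(10, 6), majority_replicated=OpId(10, 6))
committed = OpId(10, 5)

reported = set_leader_mode_as_reported(before, committed)
fixed = set_leader_mode_with_reset(before, committed)
print(f"as reported: all={reported.all_replicated}, majority={reported.majority_replicated}")
print(f"with reset:  all={fixed.all_replicated}, majority={fixed.majority_replicated}")
```

With these inputs the "as reported" variant reproduces the state from the log line (all replicated 10.6, majority replicated 10.5), while the variant that also resets all_replicated keeps the two watermarks ordered.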

              People

              • Assignee:
                tlipcon Todd Lipcon
                Reporter:
                tlipcon Todd Lipcon
              • Votes: 0
                Watchers: 2
