Kudu / KUDU-1585

Leader should back off consensus request batch sizes after follower throttling kicks in

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.10.0
    • Fix Version/s: None
    • Component/s: consensus
    • Labels: None

      Description

      I've pushed a cluster into overload, to the point that most of the servers are deep in the memory-limit zone and completing writes at a very low rate (only a few per second). In some cases, I see a leader sending batches of 70+ operations to a follower that rejects the whole batch most of the time and only occasionally accepts one or two of the operations. Nonetheless, the leader keeps sending several MB of data to that follower every second.

      Overall network traffic on this cluster is quite high even though the cluster is making very little progress, and I'm guessing most of it comes from these retries.
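
      One possible shape for the backoff this issue asks for is sketched below. It is not Kudu's implementation: the class name `BatchSizeBackoff`, its methods, and the limits (64 operations maximum, 1 minimum) are all hypothetical, chosen only to illustrate the idea. The sketch halves the per-follower batch cap whenever the follower rejects a batch and grows it by one operation whenever a batch is accepted.

```python
from dataclasses import dataclass


@dataclass
class BatchSizeBackoff:
    """Hypothetical per-follower cap on operations per consensus request."""

    max_ops: int = 64      # assumed upper bound on operations per batch
    min_ops: int = 1       # never drop below one op, so progress stays possible
    current_ops: int = 64  # current cap, starting at the assumed maximum

    def on_rejected(self) -> None:
        # Follower throttled the batch: halve the cap (multiplicative decrease).
        self.current_ops = max(self.min_ops, self.current_ops // 2)

    def on_accepted(self) -> None:
        # Follower accepted the batch: grow the cap by one op (additive increase).
        self.current_ops = min(self.max_ops, self.current_ops + 1)

    def next_batch(self, pending: list) -> list:
        # Take at most current_ops operations from the head of the queue.
        return pending[: self.current_ops]


backoff = BatchSizeBackoff()
pending_ops = list(range(70))  # 70 queued operations, as in the report
sizes = []
for _ in range(4):             # simulate four consecutive rejections
    sizes.append(len(backoff.next_batch(pending_ops)))
    backoff.on_rejected()
print(sizes)  # [64, 32, 16, 8]
```

      With a cap like this, a follower that keeps rejecting batches would receive progressively smaller requests rather than the same multi-megabyte payload every second, and the cap would recover gradually once the follower starts accepting operations again.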

              People

              • Assignee: Unassigned
              • Reporter: Todd Lipcon (tlipcon)