Kudu / KUDU-1317

Tablet re-replication is not well spread across a cluster

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.7.0
    • Fix Version/s: 0.7.0
    • Component/s: master
    • Labels:
      None

      Description

      This is an issue that Binglin Chang noticed on his ~70 node cluster. When a server hosting many tablets goes down, each of those tablets has to create new replicas elsewhere. We would expect in a 70-node cluster that all other nodes would participate in recovery in order to minimize the recovery time. However, we found that only a small number of nodes acted as 'sources' for making new tablet replicas.

      The issue is that the master currently assigns replicas in strict round-robin order. So, if the number of tablet servers in the cluster is a multiple of three, servers {A,B,C} end up hosting the same set of replicas, servers {D,E,F} host another set, and so on. If a server fails, only two servers can act as re-replication sources. If the number of servers is not a multiple of three, the problem is not quite as bad, but the sources are still limited to four (the two "adjacent" servers on each side).

      The master should spread out the replicas more randomly so that when a server goes down, a large number of other servers can act as sources for re-replication.
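
      The effect described above can be reproduced with a small simulation. The sketch below is illustrative only: it is not the attached placement.py and not the master's actual code. The function names, the replication factor of 3, and the tablet counts are assumptions made for the example. It compares how many distinct servers could act as re-replication sources after one server fails, under strict round-robin placement versus uniformly random placement.

```python
import random


def round_robin_placement(num_servers, num_tablets, rf=3):
    """Hypothetical model of strict round-robin: each tablet takes the
    next `rf` servers in rotation, wrapping around the server list."""
    placements = []
    cursor = 0
    for _ in range(num_tablets):
        placements.append([(cursor + i) % num_servers for i in range(rf)])
        cursor = (cursor + rf) % num_servers
    return placements


def random_placement(num_servers, num_tablets, rf=3, seed=0):
    """Hypothetical alternative: pick `rf` distinct servers uniformly at
    random for each tablet. A fixed seed keeps runs repeatable."""
    rng = random.Random(seed)
    return [rng.sample(range(num_servers), rf) for _ in range(num_tablets)]


def recovery_sources(placements, failed):
    """Return the set of servers that share at least one tablet with the
    failed server, i.e. the servers able to act as copy sources."""
    sources = set()
    for replicas in placements:
        if failed in replicas:
            sources.update(r for r in replicas if r != failed)
    return sources


if __name__ == "__main__":
    for n in (69, 70):
        rr = recovery_sources(round_robin_placement(n, 10 * n), failed=0)
        rnd = recovery_sources(random_placement(n, 10 * n), failed=0)
        print(f"{n} servers: round-robin sources={len(rr)}, "
              f"random sources={len(rnd)}")
```

      In this model, round-robin yields 2 sources with 69 servers and 4 sources with 70 servers, matching the counts in the description, while random placement spreads the failed server's tablets over many more peers.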

        Attachments

        1. placement.py
          1.0 kB
          Todd Lipcon


            People

            • Assignee:
              tlipcon Todd Lipcon
              Reporter:
              tlipcon Todd Lipcon
