Kudu / KUDU-1317

Tablet re-replication is not well spread across a cluster



    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 0.7.0
    • Fix Version/s: 0.7.0
    • Component/s: master
    • Labels:


      This is an issue that Binglin Chang noticed on his ~70 node cluster. When a server hosting many tablets goes down, each of those tablets has to create new replicas elsewhere. We would expect in a 70-node cluster that all other nodes would participate in recovery in order to minimize the recovery time. However, we found that only a small number of nodes acted as 'sources' for making new tablet replicas.

      The issue is that the master currently assigns replicas in strict round-robin order. So, if the number of tablet servers (TS) in the cluster is a multiple of three, we end up with one group of servers having the same set of replicas, another group of servers having another set, etc. So, if a server fails, only two servers can act as re-replication sources. If the number of servers is not a multiple of three, the problem is not quite as bad, but the sources are still limited to 4 (the two "adjacent" servers on either side).
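
      The effect described above can be reproduced with a small simulation. This is only a sketch, not the master's actual code and not the attached placement.py: it assumes a replication factor of 3 and a single counter that advances once per replica, and the function names are hypothetical.

```python
def round_robin_placement(num_servers, num_tablets, replication=3):
    """Assign replicas using one counter that advances once per replica.

    Assumption: this models "strict round-robin" as described in the issue;
    it is not taken from the Kudu master source.
    """
    placement = []
    next_server = 0
    for _ in range(num_tablets):
        replicas = []
        for _ in range(replication):
            replicas.append(next_server % num_servers)
            next_server += 1
        placement.append(replicas)
    return placement


def sources_for(placement, failed_server):
    """Return the servers sharing at least one tablet with failed_server.

    Only these servers hold a surviving copy of the failed server's tablets,
    so only they can act as re-replication sources.
    """
    sources = set()
    for replicas in placement:
        if failed_server in replicas:
            sources.update(r for r in replicas if r != failed_server)
    return sources


if __name__ == "__main__":
    for num_servers in (69, 70):
        placement = round_robin_placement(num_servers, num_tablets=1000)
        print(num_servers, sorted(sources_for(placement, failed_server=0)))
```

      With 69 servers (a multiple of three), server 0 only ever shares tablets with servers 1 and 2. With 70 servers, it shares tablets with its two neighbors on each side, for four sources in total.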

      The master should spread out the replicas more randomly so that when a server goes down, a large number of other servers can act as sources for re-replication.
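
      As a rough illustration of why a more random spread helps, the sketch below picks each tablet's servers uniformly at random and counts how many distinct servers could act as sources after a failure. Uniform sampling is an assumption made for illustration; the issue does not say which placement policy the fix uses, and the function names are hypothetical.

```python
import random


def random_placement(num_servers, num_tablets, replication=3, seed=0):
    """Pick `replication` distinct servers uniformly at random per tablet.

    Assumption: uniform sampling stands in for "more random" placement;
    the policy adopted by the actual fix may differ.
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    return [rng.sample(range(num_servers), replication)
            for _ in range(num_tablets)]


def count_sources(placement, failed_server):
    """Count servers that share at least one tablet with failed_server."""
    sources = set()
    for replicas in placement:
        if failed_server in replicas:
            sources.update(r for r in replicas if r != failed_server)
    return len(sources)


if __name__ == "__main__":
    placement = random_placement(num_servers=70, num_tablets=2000)
    print("re-replication sources for server 0:", count_sources(placement, 0))
```

      With enough tablets per server, most of the remaining servers end up sharing at least one tablet with the failed one, so recovery work is spread far beyond the two to four sources that round-robin allows.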


        1. placement.py
          1.0 kB
          Todd Lipcon



            • Assignee:
              tlipcon Todd Lipcon
              tlipcon Todd Lipcon


              • Created: