Uploaded image for project: 'Kudu'
  1. Kudu
  2. KUDU-2800

Avoid 'unintended' re-replication of long-bootstrapping tablet replicas



    • Improvement
    • Status: Resolved
    • Major
    • Resolution: Invalid
    • 1.7.0, 1.8.0, 1.7.1, 1.9.0, 1.9.1, 1.10.0
    • 1.11.0, 1.11.1
    • consensus, tserver


      As implemented in
      https://github.com/apache/kudu/blob/10ea0ce5a636a050a1207f7ab5ecf63d178683f5/src/kudu/consensus/consensus_queue.cc#L576 , the logic for tracking 'health' of tablet replicas cannot differentiate between bootstrapping and failed replicas.

      As a result, if a tablet replica is bootstrapping for times longer than the interval specified by --follower_unavailable_considered_failed_sec run-time flag, the system can start the process of re-replication of the tablet replica elsewhere.

      One option might be sending a specific error with ConsensusResponsePB in response to a Raft message sent by a leader replica, maybe adding extra information on the current progress of the replica bootstrap process. As soon as such bootstrapping follower replica isn't failing behind leader's WAL GC threshold, the leader replica will not evict it. But if the bootstrapping follower replica falls behind the WAL GC threshold, leader replica will evict it and the system will start re-replicating it elsewhere. In cases when the amount of Raft transactions for a tablet is low, this approach would allow for longer bootstrapping times of tablet replicas. That might be especially beneficial in cases when a tablet server with IO-heavy tablet replicas is being restarted, and there aren't many incoming updates/inserts for tablets hosted by the tablet server.

      However, the approach above requires the Raft consensus object for a bootstrapping replica to be at least partially functional, so it entails reading at least some information about a replica from the on-disk consensus metadata prior to proper bootstrapping of a tablet replica by a tablet server.




            vladimir_committer Vladimir Verjovkin
            aserbin Alexey Serbin
            0 Vote for this issue
            5 Start watching this issue