Description
Currently, when ksck captures a tablet's replicas in the process of Raft leader election, it might report the tablet as unavailable if the captured Raft configurations differ between
Below is an example of output from the kudu cluster ksck tool (version 1.8):
Tablet ab548e2415854f3b8bee49b59fb66d6b of table 'default.loadgen_auto_16ffa6f3c4c948459d3c86ab550d628d' is conflicted: Tablet ab548e2415854f3b8bee49b59fb66d6b of table 'default.loadgen_auto_16ffa6f3c4c948459d3c86ab550d628d' replicas' active configs disagree with the master's
  1d707658cd6b4cdb9c58ec2a811c2c6e (quasar-swnlpg-4.vpc.cloudera.com:7050): RUNNING [LEADER]
  b0da2997f80447afbcb094456ac20fa6 (quasar-swnlpg-3.vpc.cloudera.com:7050): RUNNING
  b78b856b8a94446c9609325e66b9295f (quasar-swnlpg-2.vpc.cloudera.com:7050): RUNNING
All reported replicas are:
  A = 1d707658cd6b4cdb9c58ec2a811c2c6e
  B = b0da2997f80447afbcb094456ac20fa6
  C = b78b856b8a94446c9609325e66b9295f
The consensus matrix is:
 Config source | Replicas     | Current term | Config index | Committed?
---------------+--------------+--------------+--------------+------------
 master        | A*  B   C    |              |              | Yes
 A             | A   B   C    | 2            | -1           | Yes
 B             | A   B   C    | 2            | -1           | Yes
 C             | A*  B   C    | 1            | -1           | Yes

Summary by table
 Name                                                  | RF | Status             | Total Tablets | Healthy | Recovering | Under-replicated | Unavailable
-------------------------------------------------------+----+--------------------+---------------+---------+------------+------------------+-------------
 default.loadgen_auto_16ffa6f3c4c948459d3c86ab550d628d | 3  | CONSENSUS_MISMATCH | 8             | 7       | 0          | 0                | 1
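In the consensus matrix above, every source reports the same replica set (A, B, C), the same config index and a committed config. Only the current term and the leader marker differ, which is what an in-progress leader election would look like. The following is a minimal Python sketch of how such a case might be told apart from a genuine configuration conflict. It is not ksck's actual logic: the `ConsensusView` type, the `classify` function and the result labels are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class ConsensusView:
    """One row of the consensus matrix (hypothetical type, not a Kudu API)."""
    source: str                  # "master" or a replica label such as "A"
    members: FrozenSet[str]      # replicas listed in the reported config
    leader: Optional[str]        # replica marked with '*', if any
    term: Optional[int]          # current term; the master row has none
    config_index: Optional[int]  # config index; the master row has none
    committed: bool


def classify(views: List[ConsensusView]) -> str:
    """Classify a set of views; the labels are illustrative assumptions."""
    member_sets = {v.members for v in views}
    indexes = {v.config_index for v in views if v.config_index is not None}
    # Different membership, different config indexes, or an uncommitted
    # config are treated as a real disagreement.
    if len(member_sets) > 1 or len(indexes) > 1 or not all(v.committed for v in views):
        return "CONFLICT"
    terms = {v.term for v in views if v.term is not None}
    leaders = {v.leader for v in views}
    # Same config everywhere, but the sources disagree on term or leader:
    # consistent with a snapshot taken while an election is under way.
    if len(terms) > 1 or len(leaders) > 1:
        return "ELECTION_IN_PROGRESS"
    return "CONSISTENT"


abc = frozenset({"A", "B", "C"})
# The four rows of the consensus matrix from the ksck output above.
views = [
    ConsensusView("master", abc, "A", None, None, True),
    ConsensusView("A", abc, None, 2, -1, True),
    ConsensusView("B", abc, None, 2, -1, True),
    ConsensusView("C", abc, "A", 1, -1, True),
]
print(classify(views))  # prints ELECTION_IN_PROGRESS
```

Under these assumptions, a checker could retry or report a softer status for the election case instead of counting the tablet as unavailable.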
Issue Links
- causes: KUDU-3628 ksck_->CheckMasterConsensus() is flaky in TSAN build (In Progress)