  Kudu / KUDU-2709

ksck should not report short transient election states as problematic

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.7.0, 1.8.0, 1.7.1, 1.9.0
    • Fix Version/s: None
    • Component/s: CLI, ksck
    • Labels: None

    Description

      Currently, when ksck captures a tablet's replicas in the process of Raft leader election, it might report the tablet as unavailable if the captured Raft configurations differ between

      Below is an example of output from the kudu cluster ksck tool (version 1.8):

      Tablet ab548e2415854f3b8bee49b59fb66d6b of table 'default.loadgen_auto_16ffa6f3c4c948459d3c86ab550d628d' is conflicted: Tablet ab548e2415854f3b8bee49b59fb66d6b of table 'default.loadgen_auto_16ffa6f3c4c948459d3c86ab550d628d' replicas' active configs disagree with the master's
        1d707658cd6b4cdb9c58ec2a811c2c6e (quasar-swnlpg-4.vpc.cloudera.com:7050): RUNNING [LEADER]
        b0da2997f80447afbcb094456ac20fa6 (quasar-swnlpg-3.vpc.cloudera.com:7050): RUNNING
        b78b856b8a94446c9609325e66b9295f (quasar-swnlpg-2.vpc.cloudera.com:7050): RUNNING
      All reported replicas are:
        A = 1d707658cd6b4cdb9c58ec2a811c2c6e
        B = b0da2997f80447afbcb094456ac20fa6
        C = b78b856b8a94446c9609325e66b9295f
      The consensus matrix is:
       Config source |   Replicas   | Current term | Config index | Committed?
      ---------------+--------------+--------------+--------------+------------
       master        | A*  B   C    |              |              | Yes
       A             | A   B   C    | 2            | -1           | Yes
       B             | A   B   C    | 2            | -1           | Yes
       C             | A*  B   C    | 1            | -1           | Yes
      
      Summary by table
                               Name                          | RF |       Status       | Total Tablets | Healthy | Recovering | Under-replicated | Unavailable
      -------------------------------------------------------+----+--------------------+---------------+---------+------------+------------------+-------------
       default.loadgen_auto_16ffa6f3c4c948459d3c86ab550d628d | 3  | CONSENSUS_MISMATCH | 8             | 7       | 0          | 0                | 1
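
In the consensus matrix above, all three replicas report the same membership and config index; they differ only in term and in who they believe the leader is (A and B are at term 2 with no leader marked, while C is still at term 1 with A as leader). A minimal Python sketch of how a checker could treat such a snapshot as possibly transient and re-sample before reporting it follows. This is an illustration only, not how ksck is implemented: `ConfigView`, `classify`, `check_with_retries`, the retry count, and the delay are all hypothetical, and the rule "same members and config index, different term or leader" is an assumed definition of a transient election state.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class ConfigView:
    """One row of the consensus matrix (hypothetical model, not a Kudu type)."""
    members: FrozenSet[str]      # replica UUIDs in the reported config
    leader: Optional[str]        # replica this source believes is leader
    term: Optional[int]          # current term; None for the master's row
    config_index: Optional[int]  # config index; None for the master's row


Snapshot = Tuple[ConfigView, Dict[str, ConfigView]]


def classify(master: ConfigView, replicas: Dict[str, ConfigView]) -> str:
    """Return 'healthy', 'maybe_transient', or 'conflicted' for one snapshot."""
    views = list(replicas.values())
    same_members = all(v.members == master.members for v in views)
    same_index = len({v.config_index for v in views}) <= 1
    if not (same_members and same_index):
        return "conflicted"  # membership disagreement is not election noise
    terms = {v.term for v in views}
    leaders = {v.leader for v in views}
    if master.leader is not None and len(terms) == 1 and leaders == {master.leader}:
        return "healthy"
    return "maybe_transient"  # only term and/or leader differ


def check_with_retries(sample: Callable[[], Snapshot], attempts: int = 3,
                       delay_s: float = 0.5,
                       sleep: Callable[[float], None] = time.sleep) -> str:
    """Re-sample while the state looks like an election in progress."""
    for attempt in range(attempts):
        master, replicas = sample()
        verdict = classify(master, replicas)
        if verdict != "maybe_transient":
            return verdict
        if attempt + 1 < attempts:
            sleep(delay_s)
    return "conflicted"  # still unsettled after all attempts, so report it


# The snapshot from the report above, with UUIDs shortened to A, B, C.
ABC = frozenset({"A", "B", "C"})
master = ConfigView(ABC, "A", None, None)
during_election = {
    "A": ConfigView(ABC, None, 2, -1),
    "B": ConfigView(ABC, None, 2, -1),
    "C": ConfigView(ABC, "A", 1, -1),
}

# A hypothetical later snapshot in which B has won the term-2 election.
settled_master = ConfigView(ABC, "B", None, None)
settled = {name: ConfigView(ABC, "B", 2, -1) for name in ("A", "B", "C")}
```

With these definitions, `classify(master, during_election)` yields `maybe_transient`, and a `sample` callable that returns the election snapshot first and the settled one second makes `check_with_retries` return `healthy` instead of a conflict. A tablet that stays unsettled for every attempt is still reported, so a persistent problem is not hidden.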
      

            People

              Assignee: Unassigned
              Reporter: Alexey Serbin (aserbin)
              Votes: 0
              Watchers: 2
