Kudu / KUDU-2948

ksck claims output is different but it’s actually the same


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.7.0
    • Fix Version/s: None
    • Component/s: ksck
    • Environment: RHEL 7.7

    Description

      I came across a scenario where ksck reports that a tablet has mismatched consensus and that the replicas' active configs disagree with the master's, even though the configs shown in the output appear identical. Two examples follow:

      The consensus matrix is:
      Config source | Replicas | Current term | Config index | Committed?
      -------------------------------------------------------------
      master | A B* C | | | Yes
      A | A B* C | 2939 | 20571 | Yes
      B | A B* C | 2939 | 20571 | Yes
      C | A B* C | 2939 | 20571 | Yes
      Tablet 8137349615944d45a0897090d36d7a08 of table 'impala::<TableName>' is conflicted: Tablet 8137349615944d45a0897090d36d7a08 of table 'impala::<TableName>' replicas' active configs disagree with the master's
      6684505cec6f4a49b3442786cebdf06d (<serverFQDN>:7050): RUNNING
      8a3232953edd4ba79d20711d4ea3581d (<serverFQDN>:7050): RUNNING [LEADER]
      c8b68cb1366c45199668247d4d7c0295 (<serverFQDN>:7050): RUNNING

      All the peers reported by the master and tablet servers are:
      A = 6684505cec6f4a49b3442786cebdf06d
      B = 8a3232953edd4ba79d20711d4ea3581d
      C = c8b68cb1366c45199668247d4d7c0295

       

      The consensus matrix is:
      Config source | Replicas | Current term | Config index | Committed?
      -------------------------------------------------------------
      master | A B* C | | | Yes
      A | A B* C | 5030 | 6114195 | Yes
      B | A B* C | 5030 | 6114195 | Yes
      C | A B* C | 5030 | 6114195 | Yes
      Table impala::<TableName> has 1 tablet(s) with mismatched consensus

      1b6a44eadcd145f693390587c4e3308a (<serverFQDN>:7050): RUNNING
      36f1c8c3863a49778adeb6f40c73aa26 (<serverFQDN>:7050): RUNNING [LEADER]
      6c8fc8452f374672abdae3098a198101 (<serverFQDN>:7050): RUNNING

      0 replicas' active configs differ from the master's.
      All the peers reported by the master and tablet servers are:
      A = 1b6a44eadcd145f693390587c4e3308a
      B = 36f1c8c3863a49778adeb6f40c73aa26
      C = 6c8fc8452f374672abdae3098a198101
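
      To double-check by script that the rows in a pasted matrix really are identical, something like the following can be used. This is only a minimal sketch: parse_matrix and disagreements are hypothetical helper names, not part of Kudu, and the comparison is not ksck's own logic. It only compares the columns that are printed, so it cannot rule out a difference in a field that ksck checks but the matrix does not display.

```python
from typing import List, NamedTuple


class ConfigRow(NamedTuple):
    source: str
    replicas: str
    term: str
    index: str
    committed: str


def parse_matrix(text: str) -> List[ConfigRow]:
    """Parse the pipe-separated data rows of a pasted ksck consensus matrix."""
    rows = []
    for line in text.splitlines():
        cells = [c.strip() for c in line.split("|")]
        # Skip the header, the dashed separator, and any non-table lines.
        if len(cells) != 5 or cells[0] in ("", "Config source"):
            continue
        rows.append(ConfigRow(*cells))
    return rows


def disagreements(rows: List[ConfigRow]) -> List[str]:
    """Return the config sources whose printed row differs from the others.

    The master row prints no term or index, so replicas are compared with
    the master on (replicas, committed) and with each other on (term, index).
    """
    master = next(r for r in rows if r.source == "master")
    replicas = [r for r in rows if r.source != "master"]
    bad = []
    for r in replicas:
        same_as_master = (r.replicas, r.committed) == (master.replicas, master.committed)
        same_as_peers = (r.term, r.index) == (replicas[0].term, replicas[0].index)
        if not (same_as_master and same_as_peers):
            bad.append(r.source)
    return bad


# First matrix from this report, pasted verbatim.
MATRIX = """\
Config source | Replicas | Current term | Config index | Committed?
-------------------------------------------------------------
master | A B* C | | | Yes
A | A B* C | 2939 | 20571 | Yes
B | A B* C | 2939 | 20571 | Yes
C | A B* C | 2939 | 20571 | Yes
"""

if __name__ == "__main__":
    print(disagreements(parse_matrix(MATRIX)))
```

      For both matrices above the returned list is empty, i.e. no printed column differs between the master and the replicas, which is what makes the "conflicted" verdict surprising.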

       

      The tablet servers are under high load: I noticed many backpressure messages. In addition, the drives for data_dirs and wal_dir are encrypted using navencrypt.

          People

            araina Ashwani Raina
            achennaka@cloudera.com Abhishek Chennaka
            Votes: 0
            Watchers: 4
