  1. Kudu
  2. KUDU-2287

Add replica metric tracking time since there was a valid leader

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.7.0
    • Fix Version/s: 1.8.0
    • Component/s: ksck, metrics, supportability
    • Labels:
      None

      Description

      Currently, monitoring systems can report that the Kudu cluster is perfectly healthy when in fact some tablet has gotten "stuck" with no leader (e.g., due to a network connectivity problem or a bug). If we exposed a numeric metric on a tablet indicating the time since a replica was last healthy, or the number of failed election attempts, etc., we could easily monitor for this case.
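
      A minimal sketch of the idea, not Kudu's actual implementation: a replica records when it last saw a valid leader and counts failed elections, and a monitoring check compares the elapsed time against a threshold. The names `LeaderTracker`, `on_leader_heartbeat`, `on_failed_election`, `seconds_since_valid_leader`, `is_stuck`, and the 60-second threshold are all hypothetical and chosen only for illustration.

```python
import time


class LeaderTracker:
    """Hypothetical per-replica tracker (illustration only, not Kudu code)."""

    def __init__(self, clock=time.monotonic):
        # A monotonic clock avoids jumps when the wall clock is adjusted.
        self._clock = clock
        self._last_leader_seen = clock()
        self.failed_elections = 0

    def on_leader_heartbeat(self):
        # A valid leader was observed: reset both signals.
        self._last_leader_seen = self._clock()
        self.failed_elections = 0

    def on_failed_election(self):
        # Count election attempts that did not produce a leader.
        self.failed_elections += 1

    def seconds_since_valid_leader(self):
        # Gauge-style value a metrics endpoint could expose.
        return self._clock() - self._last_leader_seen


def is_stuck(tracker, threshold_s=60.0):
    """Return True when the replica has gone too long without a leader."""
    return tracker.seconds_since_valid_leader() > threshold_s
```

      A monitoring system could then alert on the elapsed-time value alone, without having to infer leader loss from cluster-wide health.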

              People

              • Assignee: Attila Bukor (abukor)
              • Reporter: Todd Lipcon (tlipcon)
