Kudu / KUDU-2287

Add replica metric tracking time since there was a valid leader

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.7.0
    • Fix Version/s: 1.8.0
    • Component/s: ksck, metrics, supportability
    • Labels: None

    Description

      Currently, monitoring systems can report that a Kudu cluster is perfectly healthy when in fact some tablet is "stuck" with no leader (e.g., due to a network connectivity problem or a bug). If each tablet exposed a numeric metric, such as the time since a replica was last healthy or the number of failed election attempts, monitoring systems could easily detect this case.
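
      As a rough illustration of how a monitoring system might consume such a metric, the sketch below scans a list of metric entities and returns the tablets whose time without a valid leader exceeds a threshold. The metric name `time_since_valid_leader_ms`, the 60-second threshold, and the entity layout (`type`, `id`, and a `metrics` list of `name`/`value` pairs) are assumptions made for this example, not names taken from this issue or from Kudu itself.

```python
from typing import Any, Dict, List

# Hypothetical metric name and alert threshold, chosen for illustration only.
LEADER_GAP_METRIC = "time_since_valid_leader_ms"
MAX_LEADER_GAP_MS = 60_000


def find_leaderless_tablets(
    entities: List[Dict[str, Any]],
    metric_name: str = LEADER_GAP_METRIC,
    max_gap_ms: int = MAX_LEADER_GAP_MS,
) -> List[str]:
    """Return IDs of tablet entities whose leader-gap metric exceeds max_gap_ms."""
    stuck = []
    for entity in entities:
        # Skip non-tablet entities (for example, server-level metrics).
        if entity.get("type") != "tablet":
            continue
        for metric in entity.get("metrics", []):
            if metric.get("name") == metric_name and metric.get("value", 0) > max_gap_ms:
                stuck.append(entity.get("id"))
                break
    return stuck


# Made-up sample data in the assumed layout.
sample = [
    {"type": "tablet", "id": "tablet-a",
     "metrics": [{"name": "time_since_valid_leader_ms", "value": 250}]},
    {"type": "tablet", "id": "tablet-b",
     "metrics": [{"name": "time_since_valid_leader_ms", "value": 900_000}]},
    {"type": "server", "id": "server-1", "metrics": []},
]

print(find_leaderless_tablets(sample))
```

      With the sample data, only `tablet-b` is flagged, because its value is above the assumed threshold. A real check would load the entities from the server's metrics output rather than from a hard-coded list.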

            People

              Assignee: abukor Attila Bukor
              Reporter: tlipcon Todd Lipcon