KUDU-2915

Support to delete dead tservers from CLI

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.10.0
    • Fix Version/s: 1.16.0
    • Component/s: CLI, ops-tooling

    Description

      Nodes in a cluster sometimes crash because of machine problems such as disk corruption, which can be quite common. However, as long as there are dead tservers, ksck always reports an error (e.g. "Not all Tablet Servers are reachable"), even after all tables have recovered and are healthy.
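      For illustration, the check described above could be scripted roughly as follows. This is a minimal sketch, not part of the proposed change: the helper names `build_ksck_command` and `cluster_is_healthy` are hypothetical, the `kudu` binary is assumed to be on the PATH, and the sketch assumes that ksck exits with a non-zero status whenever it reports an error.

```python
import subprocess
from typing import List, Sequence


def build_ksck_command(master_addresses: Sequence[str]) -> List[str]:
    """Build the argv for `kudu cluster ksck` (hypothetical helper).

    ksck takes the master addresses as one comma-separated argument.
    """
    if not master_addresses:
        raise ValueError("at least one master address is required")
    return ["kudu", "cluster", "ksck", ",".join(master_addresses)]


def cluster_is_healthy(master_addresses: Sequence[str]) -> bool:
    """Run ksck and report whether it finished without errors.

    Assumption: a zero exit status means ksck found no errors. While a
    dead tserver is still known to the masters, this is expected to
    return False even if every table is healthy.
    """
    result = subprocess.run(
        build_ksck_command(master_addresses),
        capture_output=True,
        text=True,
    )
    return result.returncode == 0
```

      With placeholder addresses, `cluster_is_healthy(["master-1:7051", "master-2:7051", "master-3:7051"])` would run the check against a three-master cluster.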

      Currently, the only way to get a healthy ksck status is to restart all masters one by one. In some cases, for example when a machine is completely corrupted, we would like ksck to report a healthy status without restarting the masters: after a restart the cluster takes some time to recover, and during that time scans and upserts to tables are affected. The recovery time can be long and depends mainly on the size of the cluster. This problem can be serious and annoying, especially when tservers crash frequently in a large cluster.

      It would be valuable to have an easier way to delete dead tservers from the master. I will add a kudu command to support this.

          People

            Hexin Hexin
            He Xin Hexin