[KUDU-2915] Support to delete dead tservers from CLI - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 1.10.0
Fix Version/s: 1.16.0
Component/s: CLI, ops-tooling
Labels:
- supportability

Description

Nodes in a cluster sometimes crash because of machine problems such as disk corruption, which can be quite common. However, as long as there are dead tservers, the ksck result always shows an error (e.g. "Not all Tablet Servers are reachable"), even though all tables have recovered and are healthy.

Currently, the only way to get a healthy ksck status is to restart all masters one by one. In some cases, for example when a machine is completely corrupted, we would like to get a healthy ksck status without restarting, because after the masters restart the cluster takes some time to recover, during which scans of and upserts to tables are affected. The recovery time can be long and depends mainly on the scale of the cluster. This problem can be serious and annoying, especially when tservers crash frequently in a large cluster.

It would be valuable to have an easier way to delete dead tservers from the master. I will add a kudu command to do this.
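As a rough illustration of the behavior requested above, here is a toy Python model. It is not Kudu code: `MasterRegistry`, `health_errors`, and `unregister_dead` are hypothetical names invented for this sketch, and the real master's bookkeeping is assumed to be more involved. It only shows why a dead tserver that stays registered keeps the health check failing, and how removing that one entry clears the error without a restart.

```python
from dataclasses import dataclass, field


@dataclass
class TServerEntry:
    """One registered tablet server, as seen by the (hypothetical) registry."""
    uuid: str
    reachable: bool


@dataclass
class MasterRegistry:
    """Toy stand-in for the master's in-memory list of registered tservers."""
    entries: dict = field(default_factory=dict)

    def register(self, uuid, reachable=True):
        self.entries[uuid] = TServerEntry(uuid, reachable)

    def health_errors(self):
        # Mimics the symptom in the description: every tserver that is still
        # registered but unreachable is reported as an error.
        dead = sorted(u for u, e in self.entries.items() if not e.reachable)
        return [f"tserver {u} is unreachable" for u in dead]

    def unregister_dead(self, uuid):
        # Remove a single dead tserver; refuse to drop one that is still live.
        entry = self.entries.get(uuid)
        if entry is None:
            raise KeyError(f"unknown tserver {uuid}")
        if entry.reachable:
            raise ValueError(f"tserver {uuid} is still reachable")
        del self.entries[uuid]


if __name__ == "__main__":
    registry = MasterRegistry()
    registry.register("ts-a")
    registry.register("ts-b", reachable=False)  # crashed machine
    print(registry.health_errors())
    registry.unregister_dead("ts-b")
    print(registry.health_errors())
```

The guard against removing a reachable tserver is a design choice of this sketch, meant to reflect that the request is only about dead tservers.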

People

Assignee: Hexin

Reporter: Hexin

Votes: 0

Watchers: 4

Dates

Created: 01/Aug/19 22:52

Updated: 21/Feb/22 19:06

Resolved: 21/Feb/22 19:06