Details
- New Feature
- Status: Resolved
- Major
- Resolution: Fixed
- None
- None
- Reviewed
Description
Currently, if a regionserver is aborted due to a fatal error or stopped on purpose by an operator, it is added to the ServerManager#deadservers list and shown under "Dead Servers" in the master UI. This is a useful warning: it lets operators notice self-aborted servers and run a sanity check to avoid further issues. However, even after the necessary checks, when the operator is sure that the node has been decommissioned (for repair, for example), there is no way to clear the dead server list short of restarting the master. See the mailing list discussion for more details.
Here we propose adding an HBase shell command that lets advanced users clear the dead server list in ServerManager. The command should be executed with caution.
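To make the proposal concrete, here is a minimal sketch of the intended behavior, written as a toy in-memory model rather than against HBase itself. The class `DeadServerList` and its methods `markDead`, `listDead`, and `clearDead` are hypothetical names chosen for illustration; they are not the ServerManager API, and the server name strings are made up. It assumes that clearing is done per server and that the caller is told which requested servers could not be cleared.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Toy model of a master-side dead server list (hypothetical, not HBase code). */
public class DeadServerList {
    // Insertion-ordered set of server names currently considered dead.
    private final Set<String> dead = new LinkedHashSet<>();

    /** Records a server as dead, e.g. after an abort or an operator stop. */
    public void markDead(String serverName) {
        dead.add(serverName);
    }

    /** Returns a snapshot of the dead servers, as the master UI would show them. */
    public List<String> listDead() {
        return new ArrayList<>(dead);
    }

    /**
     * Removes the given servers from the dead list.
     * Returns the servers that were NOT cleared because they were not listed as dead.
     */
    public List<String> clearDead(List<String> serverNames) {
        List<String> notCleared = new ArrayList<>();
        for (String name : serverNames) {
            if (!dead.remove(name)) {
                notCleared.add(name);
            }
        }
        return notCleared;
    }
}
```

Returning the servers that were not cleared, instead of failing silently, gives the operator a chance to spot a mistyped server name before assuming the list is clean.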
Attachments
Issue Links
- is related to
- HBASE-19131 Add the ClusterStatus hook and cleanup other hooks which can be replaced by ClusterStatus hook
- Closed
- relates to
- HBASE-14223 Meta WALs are not cleared if meta region was closed and RS aborts
- Closed
- links to