Details
- Improvement
- Status: Closed
- Major
- Resolution: Fixed
- 0.90.0
- None
- None
- Reviewed
Description
Keeping servers in DeadServer until it reaches some maximum isn't very friendly; it confuses even the best of our users:
09:27 < gbowyer> Hi all, I have apparently three dead RS in my cluster, I cannot find references to them in HDFS or in ZK, how do I still report dead RS
09:27 < gbowyer> also the same nodes are reported as live region servers
The subtle startcode difference can be hard to catch. This behavior also differs from 0.20 (so old users get confused, like I did when debugging this problem) and from Hadoop's handling of dead DataNodes. It was introduced in HBASE-3282.
I think this should be improved by doing what Hadoop does: removing the RS from DeadServers when a new instance with the same hostname+port checks in. Stack says we should do it in ServerManager.checkIsDead.
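A rough sketch of the proposed behavior, not the actual patch: when a region server checks in, any dead entry with the same hostname+port but an older startcode is dropped. The class `DeadServerSketch` and its methods are hypothetical names for illustration, and the sketch assumes server names are strings of the form `hostname,port,startcode`.

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

// Hypothetical stand-in for the master's dead-server bookkeeping.
class DeadServerSketch {
    // Full server names, assumed to look like "hostname,port,startcode".
    private final Set<String> deadServers = new HashSet<>();

    // Strip the trailing ",startcode" so two instances of the same
    // hostname+port compare equal.
    static String hostAndPort(String serverName) {
        int idx = serverName.lastIndexOf(',');
        return idx < 0 ? serverName : serverName.substring(0, idx);
    }

    void add(String serverName) {
        deadServers.add(serverName);
    }

    boolean isDead(String serverName) {
        return deadServers.contains(serverName);
    }

    int size() {
        return deadServers.size();
    }

    // Called when a new instance checks in: remove every dead entry that
    // shares its hostname+port. Returns true if anything was removed.
    boolean cleanPreviousInstance(String newServerName) {
        String key = hostAndPort(newServerName);
        boolean removed = false;
        Iterator<String> it = deadServers.iterator();
        while (it.hasNext()) {
            if (hostAndPort(it.next()).equals(key)) {
                it.remove();
                removed = true;
            }
        }
        return removed;
    }
}
```

With this in place, a restarted server no longer appears in both the live and dead lists, while dead servers on other hosts or ports are left untouched.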
Attachments
Issue Links
- relates to HBASE-4359 Show dead RegionServer names in the HMaster info page (Closed)