Details
- Improvement
- Status: Closed
- Minor
- Resolution: Fixed
- 0.95.2
- None
- Reviewed
- 0.96notable
Description
When a binary dies on a server, we don't try to restart it, even though a restart would be possible in most cases.
We could have something like:
loop
    start
    wait
    if cleanStop then exit
    if the previous stop was less than 5 minutes ago then sleep 5 minutes
endloop
This is simple for the master and the backup master. It is slightly more complex for the region server, which can be stopped by a script or by the shutdown procedure, so a deliberate stop has to be told apart from a crash.
In the long term, it could allow a restart with exactly the same assignments.
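The restart loop described above could be sketched as a small shell function. This is only an illustration under stated assumptions, not the change that was committed for this issue: the function name `supervise` and its parameters are hypothetical, and a "clean stop" is assumed to mean that the process exits with status 0.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the restart loop from the description.
# Usage: supervise MIN_INTERVAL_SECONDS PAUSE_SECONDS COMMAND [ARGS...]

supervise() {
  local min_interval="$1"
  local pause="$2"
  shift 2
  local last_stop=0
  local now

  while true; do
    # Start the process in the foreground and wait for it to exit.
    # Assumption: exit status 0 means a clean (intentional) stop.
    if "$@"; then
      return 0
    fi

    now=$(date +%s)
    # If the previous unclean stop was recent, pause before restarting
    # so that a crashing process does not restart in a tight loop.
    if [ $((now - last_stop)) -lt "$min_interval" ]; then
      sleep "$pause"
    fi
    # Refresh after any pause, so the pause itself is not counted
    # as healthy uptime on the next iteration.
    last_stop=$(date +%s)
  done
}
```

With the values from the description, a call would look like `supervise 300 300 some-foreground-command`, where the command is whatever starts the master or region server in the foreground. Relying on the exit status alone would probably not be enough for the region server, since a stop issued by a script or by the shutdown procedure has to count as clean as well.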
Attachments
Issue Links
- is related to
  - HBASE-15523 enhance hbase-daemon.sh to enable autorestart. (Reopened)
- is required by
  - HBASE-5843 Improve HBase MTTR - Mean Time To Recover (Closed)