Details
- New Feature
- Status: Closed
- Minor
- Resolution: Won't Fix
- 0.18.0
- None
- None
Description
The Region Historian (see HBASE-533) is very useful for debugging cluster issues involving region splitting, assignment, etc. It would also be useful if the master kept a separate history of regionservers, recording when they:
- start up and report in
- quiesce/exit when the master tells them to
- fail (and report error?) and exit
- are declared dead after their lease expires
- are assigned a region (some overlap with Region Historian but is a different view)
- are asked to close a region (some overlap with Region Historian but is a different view)
Maybe call it a Service Historian?
There should be event logs per regionserver identity, available even if a regionserver is offline. The logs can have a simple structure: Timestamp, Event, Description, like the Region Historian tables.
Otherwise it is still necessary to comb through logs to determine if a regionserver was flaky during a period of time.
It would also be very helpful if regionservers could send an error string when they abort and restart, so that the errors can be viewed in the service history table.
Hyperlinks in the service history table would make it easy to follow a table and its regions over the lifetime of the system, essentially reconstructing the client view of the cluster over time.
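To make the proposal concrete, here is a minimal in-memory sketch of the per-regionserver event log described above. It is only an illustration: `ServiceHistorian`, `EventType`, `record`, and `historyOf` are hypothetical names, not existing HBase classes or methods, and the server identity is assumed to be a plain string. A real implementation would presumably persist rows in a table, as the Region Historian does, rather than hold them in memory.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch only; none of these names exist in HBase.
class ServiceHistorian {

    // Event kinds taken from the list in the description.
    enum EventType {
        STARTUP, QUIESCE_EXIT, FAILURE_EXIT, LEASE_EXPIRED,
        REGION_ASSIGNED, REGION_CLOSE_REQUESTED
    }

    // One history row: Timestamp, Event, Description.
    static final class Event {
        final long timestamp;
        final EventType type;
        final String description;

        Event(long timestamp, EventType type, String description) {
            this.timestamp = timestamp;
            this.type = type;
            this.description = description;
        }

        @Override
        public String toString() {
            return timestamp + "\t" + type + "\t" + description;
        }
    }

    // Keyed by regionserver identity, so the history remains readable
    // after the server itself has gone offline.
    private final Map<String, List<Event>> history = new ConcurrentHashMap<>();

    // Appends one event to the log of the given regionserver.
    void record(String serverId, long timestamp, EventType type, String description) {
        history.computeIfAbsent(serverId,
                k -> Collections.synchronizedList(new ArrayList<Event>()))
               .add(new Event(timestamp, type, description));
    }

    // Returns a snapshot of the events for one regionserver, in insertion order.
    List<Event> historyOf(String serverId) {
        List<Event> events = history.get(serverId);
        if (events == null) {
            return Collections.emptyList();
        }
        synchronized (events) {
            return new ArrayList<>(events);
        }
    }

    public static void main(String[] args) {
        ServiceHistorian historian = new ServiceHistorian();
        String rs = "rs1.example.com:60020"; // made-up server identity
        historian.record(rs, System.currentTimeMillis(), EventType.STARTUP, "reported in to master");
        historian.record(rs, System.currentTimeMillis(), EventType.FAILURE_EXIT,
                "aborted; error string sent by the regionserver would go here");
        for (Event e : historian.historyOf(rs)) {
            System.out.println(e);
        }
    }
}
```

Keying the log by server identity rather than by live connection is what would let the master answer "was this regionserver flaky last night?" without combing through log files.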
Issue Links
- is blocked by HBASE-546 Use Zookeeper in HBase (Closed)