Description
When running org.apache.hadoop.hbase.tool.Canary with the arguments -zookeeper -treatFailureAsError, the Canary tries to get a znode from each ZooKeeper server in the ensemble. If any server is unavailable or unresponsive, the Canary exits with a failure code.
If we use the Canary to gauge server health and alert accordingly, this can be too strict. For example, in a 5-node ZooKeeper ensemble, having one node down is safe and expected during rolling upgrades or patches.
This is a request to allow the Canary to take another parameter:
-permittedZookeeperFailures <N>
For example, with N=1 in the 5-node ZooKeeper ensemble above, the Canary would still pass if 4 or more ZooKeeper nodes are reachable, but fail if 3 or fewer are reachable.
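To make the requested pass/fail rule concrete, here is a minimal sketch of the threshold check in Java. It is not the Canary's actual code: the class name ZookeeperFailureTolerance and the method passes are hypothetical, and the sketch assumes the Canary already knows how many servers in the ensemble answered the znode read.

```java
// Hypothetical helper illustrating the requested rule; not part of HBase.
public final class ZookeeperFailureTolerance {

    private ZookeeperFailureTolerance() {
        // Static utility class, no instances.
    }

    /**
     * Decides whether the ZooKeeper check should pass.
     *
     * @param ensembleSize      total number of ZooKeeper servers probed
     * @param reachableServers  number of servers that answered the znode read
     * @param permittedFailures value of the proposed -permittedZookeeperFailures (N)
     * @return true if the number of unreachable servers is at most N
     */
    public static boolean passes(int ensembleSize, int reachableServers, int permittedFailures) {
        if (ensembleSize < 0 || reachableServers < 0 || reachableServers > ensembleSize) {
            throw new IllegalArgumentException("reachableServers must be between 0 and ensembleSize");
        }
        if (permittedFailures < 0) {
            throw new IllegalArgumentException("permittedFailures must not be negative");
        }
        int failedServers = ensembleSize - reachableServers;
        return failedServers <= permittedFailures;
    }
}
```

With N=0 this reduces to the current strict behavior, where any unreachable server fails the check.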
(This is my first Jira posting... sorry if I messed anything up.)
Attachments
Issue Links
- is a clone of HBASE-21126 Add ability for HBase Canary to ignore a configurable number of ZooKeeper down nodes (Resolved)
- links to