Description
The master detector code currently walks the permanent node path recursively (e.g. /home/mesos/cluster), attempting to create /home, then /home/mesos, then /home/mesos/cluster, and trapping NoAuth and NodeExists errors along the way.
A simple performance improvement would be to perform a stat (a ZooKeeper exists call) on the full permanent path and skip the recursive creation if the node already exists. Alternatively, it seems practical to forgo node creation by the slave altogether, and have the slave commit suicide if the permanent node is not present.
The current behavior results in reams of ZooKeeper server logging whenever there is a master failover, because the server logs user errors by default (a default we'd prefer to retain). I'm not sure what performance impact this has on ZooKeeper, but it's non-zero and a fix seems trivial.
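To make the difference concrete, here is a minimal sketch of the two proposals next to the current behavior. It does not use the real ZooKeeper client or the actual detector code: FakeZooKeeper, its user_errors counter, and the three function names are all hypothetical stand-ins, and the fake's error handling is simplified. It only assumes that each failed create is a user error the server would log.

```python
class NodeExistsError(Exception):
    """Stand-in for ZooKeeper's NodeExists error."""


class NoAuthError(Exception):
    """Stand-in for ZooKeeper's NoAuth error."""


class FakeZooKeeper:
    """Hypothetical in-memory stand-in for a ZooKeeper server."""

    def __init__(self, existing, read_only=()):
        self.nodes = set(existing)
        self.read_only = set(read_only)  # parents under which create is denied
        self.user_errors = 0             # errors the server would have logged

    def exists(self, path):
        # A plain lookup; a miss is not counted as a logged user error here.
        return path in self.nodes

    def create(self, path):
        parent = path.rsplit("/", 1)[0] or "/"
        if parent in self.read_only:
            self.user_errors += 1
            raise NoAuthError(path)
        if path in self.nodes:
            self.user_errors += 1
            raise NodeExistsError(path)
        self.nodes.add(path)


def prefixes(path):
    """'/home/mesos/cluster' -> ['/home', '/home/mesos', '/home/mesos/cluster']"""
    parts = path.strip("/").split("/")
    return ["/" + "/".join(parts[:i]) for i in range(1, len(parts) + 1)]


def ensure_path_recursive(zk, path):
    """Current behavior: try to create every prefix and trap the errors."""
    for prefix in prefixes(path):
        try:
            zk.create(prefix)
        except (NodeExistsError, NoAuthError):
            pass  # trapped on the client, but the server already logged it


def ensure_path_with_stat(zk, path):
    """First proposal: stat the full path and skip creation if it exists."""
    if zk.exists(path):
        return
    ensure_path_recursive(zk, path)


def require_path(zk, path):
    """Second proposal: never create; fail fast if the node is missing."""
    if not zk.exists(path):
        raise RuntimeError("permanent node %s is not present" % path)


# Usage: the path already exists, as it would after a master failover.
path = "/home/mesos/cluster"
old = FakeZooKeeper(prefixes(path))
ensure_path_recursive(old, path)   # old.user_errors is now 3
new = FakeZooKeeper(prefixes(path))
ensure_path_with_stat(new, path)   # new.user_errors stays 0
```

In this model the recursive walk costs one logged error per path component on every reconnect, while the stat-first variant costs a single read when the path is already in place and falls back to the old walk only when it is not.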
Issue Links
- is related to MESOS-229 mesos zookeeper group code fails to connect when pre-existing children of the group path are read-only (Resolved)