The existing quota logic in ZooKeeper is only used for keeping track of node count and byte usage per path. When the "soft" limit is exceeded, the server logs a warning message. This is not sufficient for our operational requirements. Here is what we are planning to do. We have already implemented the majority of this functionality, except the hard limit.
1. Resource metrics – The system will be able to monitor the following resource usage and enforce hard limits.
- node count
- used bytes
- write throughput (bytes/sec, transactions/sec)
- read throughput (bytes/sec, transactions/sec) (monitoring only, no hard limit)
2. Usage monitoring and soft-limit
The server is going to export per-path usage statistics via a four-letter command, since it is easier for an external monitoring system to get these numbers this way than to read them from ZooKeeper directly. For example, the new command can report the following stats:
read_bytes.<path-A>.60 20 //Read bytes/sec for the last 1 min
For read and write throughput, all servers will report read throughput statistics, but only the leader reports write throughput statistics. Internally, we already use the high-performance multi-window counter provided by Facebook's jcommon (https://github.com/facebook/jcommon/blob/master/stats/src/main/java/com/facebook/stats/MultiWindowRate.java). However, I think the community may want a simpler counter to reduce the dependency requirement.
Additionally, I am going to add an option to disable the soft-limit check, since writing a warning message to the log file is not that useful and may affect performance (especially when replaying the txnlog while the soft limit is exceeded).
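To give an idea of what a "simpler counter" could look like, here is a minimal sketch of a single-window rate counter with one-second buckets. The class name WindowedRateCounter, its methods, and the injected clock are all hypothetical; none of this is existing ZooKeeper or jcommon API. The statLine output just mirrors the example stat format above.

```java
import java.util.Arrays;
import java.util.function.LongSupplier;

/** Hypothetical sliding-window rate counter (not ZooKeeper or jcommon code). */
final class WindowedRateCounter {
    private final long[] buckets;       // one accumulator per second of the window
    private final long[] bucketSecond;  // which wall-clock second each slot holds
    private final LongSupplier clockMillis;

    WindowedRateCounter(int windowSeconds, LongSupplier clockMillis) {
        this.buckets = new long[windowSeconds];
        this.bucketSecond = new long[windowSeconds];
        Arrays.fill(bucketSecond, Long.MIN_VALUE); // marks a slot as never used
        this.clockMillis = clockMillis;
    }

    /** Record delta units (bytes or transactions) at the current time. */
    synchronized void add(long delta) {
        long sec = clockMillis.getAsLong() / 1000;
        int i = (int) (sec % buckets.length);
        if (bucketSecond[i] != sec) { // slot holds an older second: recycle it
            buckets[i] = 0;
            bucketSecond[i] = sec;
        }
        buckets[i] += delta;
    }

    /** Average units per second over the window, ignoring expired slots. */
    synchronized long ratePerSecond() {
        long now = clockMillis.getAsLong() / 1000;
        long sum = 0;
        for (int i = 0; i < buckets.length; i++) {
            if (bucketSecond[i] > now - buckets.length && bucketSecond[i] <= now) {
                sum += buckets[i];
            }
        }
        return sum / buckets.length;
    }

    /** Formats one line as <metric>.<path>.<windowSeconds> <rate>. */
    String statLine(String metric, String path) {
        return metric + "." + path + "." + buckets.length + " " + ratePerSecond();
    }
}
```

A server could keep one such counter per quota path and metric, for example new WindowedRateCounter(60, System::currentTimeMillis) for the one-minute read_bytes window. Multiple windows would need one counter per window, which is where something like MultiWindowRate is more efficient.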
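As a rough illustration of the proposed switch, the check could be gated by a system property. The property name zookeeper.quota.softLimitCheck and the SoftLimitConfig class below are made up for this sketch and do not exist in ZooKeeper today. Treating a negative limit as "no limit" is also an assumption.

```java
/** Hypothetical toggle for the soft-limit warning (names are illustrative only). */
final class SoftLimitConfig {
    // Assumed property name; defaults to "true" so current behaviour is kept.
    static final String PROP = "zookeeper.quota.softLimitCheck";

    static boolean softLimitCheckEnabled() {
        return Boolean.parseBoolean(System.getProperty(PROP, "true"));
    }

    /** True when a warning should be logged for this usage/limit pair. */
    static boolean shouldWarn(long usage, long softLimit) {
        // A negative limit is treated as "no limit set" in this sketch.
        return softLimitCheckEnabled() && softLimit >= 0 && usage > softLimit;
    }
}
```

In a real patch the flag would probably be read once at startup rather than on every update, so that the disabled case costs nothing during txnlog replay.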
3. Hard limit
PrepRequestProcessor on the leader has to decide when to reject a given request (instead of the current patch, which rejects the request down in DataTree). However, PrepRequestProcessor will need to access more data from the DataTree in order to decide when to reject a request. There are two possible implementations here. First, usage tracking and limit checking can be implemented in the DataTree; this is simpler given the amount of information that is available in the DataTree itself. The problem is that this limit-checking logic will not be accurate when there are in-flight requests. The other option is to move limit checking to PrepRequestProcessor and maintain in-flight statistics similar to ChangeRecord.
I think the latter option is much more complicated. Since it is OK to allow usage to exceed the hard limit slightly, I might go with the first option.
Let me know if you have any suggestions.
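To make the trade-off of the first option concrete, here is a minimal sketch of a per-path check that only looks at already-applied usage. PathQuota and its methods are hypothetical names for illustration, not the actual DataTree or patch code. A negative limit meaning "unlimited" is an assumption.

```java
/** Hypothetical per-path hard-limit tracker based on applied usage only. */
final class PathQuota {
    private final long maxNodes; // negative means unlimited (assumption)
    private final long maxBytes; // negative means unlimited (assumption)
    private long nodes;
    private long bytes;

    PathQuota(long maxNodes, long maxBytes) {
        this.maxNodes = maxNodes;
        this.maxBytes = maxBytes;
    }

    /** Checks a proposed change against applied usage; in-flight requests are not counted. */
    synchronized boolean wouldExceed(long nodeDelta, long byteDelta) {
        boolean overNodes = maxNodes >= 0 && nodes + nodeDelta > maxNodes;
        boolean overBytes = maxBytes >= 0 && bytes + byteDelta > maxBytes;
        return overNodes || overBytes;
    }

    /** Called when a transaction is actually applied to the tree. */
    synchronized void apply(long nodeDelta, long byteDelta) {
        nodes += nodeDelta;
        bytes += byteDelta;
    }

    synchronized long nodeCount() { return nodes; }
    synchronized long usedBytes() { return bytes; }
}
```

If two create requests are both checked before either is applied, both pass and the path ends up over its limit by one request's worth of usage. That is the inaccuracy described above; closing the gap would require tracking pending deltas in PrepRequestProcessor, which is the second option.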