Description
In Oak we have a PerfLogger which can be used to log slow operations. It is used in a variety of places, most prominently:
- DocumentNodeStore
- MongoDocumentStore
- EventGenerator
- DocumentNodeStoreBranch
- AbstractDocumentNodeState
- NodeObserver
- and in several places in indexing code
- as well as in the RDB case
Currently we use only the DEBUG thresholds; we're not using the INFO thresholds. That means we only see these logs when the level is set to DEBUG, and we see nothing at the default INFO level.
We could, however, consider logging very slow operations at INFO by default, much like MongoDB by default always logs requests slower than 100ms.
In AEM's case we might have many slow operations though, so choosing a low threshold could flood the log and cause trouble that way.
Additionally, there isn't a (global) config that can be tweaked to set this threshold. So far it is all hard-coded, and the INFO threshold is actually Long.MAX_VALUE (so essentially never).
I suggest introducing such a config. It perhaps doesn't have to be global; it could, for example, cover just a few classes. But having such a config, and having it tweakable via skyline-ops, could be a useful feature.
We could then set, for example, a threshold of 60 seconds and monitor the amount of logs this generates throughout Skyline. If the number is low enough, we can go lower, to e.g. 30 seconds or ideally eventually 10 seconds, always provided we are not flooding Splunk that way.
The prerequisite for something like that is a skyline-ops configurable PerfLogger threshold, wired to the corresponding PerfLogger.end() calls at the places where we want it.
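
To make the proposal concrete, here is a minimal sketch of how a configurable INFO threshold could be resolved. It is not existing Oak code: the class name PerfLogThreshold and the property name oak.perflogger.infoThresholdMillis are hypothetical, and reading a system property is only one possible mechanism (an OSGi config would be another). When the property is unset or invalid, it falls back to Long.MAX_VALUE, which keeps today's behaviour of never logging at INFO.

```java
/**
 * Hypothetical helper, not part of Oak: resolves a configurable INFO
 * threshold for slow-operation logging and picks the log level.
 */
final class PerfLogThreshold {

    /** Hypothetical property name, chosen for illustration only. */
    static final String INFO_THRESHOLD_PROP = "oak.perflogger.infoThresholdMillis";

    /** Same as the currently hard-coded value: effectively never log at INFO. */
    static final long DISABLED = Long.MAX_VALUE;

    enum Level { NONE, DEBUG, INFO }

    private PerfLogThreshold() {
    }

    /** Returns the configured INFO threshold in ms, or DISABLED if unset or invalid. */
    static long infoThresholdMillis() {
        // Long.getLong falls back to the default if the property is missing
        // or cannot be parsed as a number.
        long value = Long.getLong(INFO_THRESHOLD_PROP, DISABLED);
        return value > 0 ? value : DISABLED;
    }

    /** Decides at which level an operation that took elapsedMillis would be logged. */
    static Level levelFor(long elapsedMillis, long debugThresholdMillis, long infoThresholdMillis) {
        if (elapsedMillis > infoThresholdMillis) {
            return Level.INFO;
        }
        if (elapsedMillis > debugThresholdMillis) {
            return Level.DEBUG;
        }
        return Level.NONE;
    }
}
```

A call site would then pass PerfLogThreshold.infoThresholdMillis() as the INFO threshold to the relevant PerfLogger.end() call instead of the hard-coded Long.MAX_VALUE, assuming the end() variant used there accepts an INFO threshold. With the value set to 60000, only operations slower than 60 seconds would show up at INFO.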