When many WorkflowJobBean and CoordinatorJobBean instances have to be followed up by creating SLASummaryBean instances, the following can occur:
- we set oozie.sla.service.SLAService.capacity to a sane value like 10000 to limit heap consumption
- SLACalculatorMemory#addRegistration() and SLACalculatorMemory#updateRegistration() would:
  - either emit TRACE level logs like "SLA Registration Event - Job:", showing that the add / update of the SLARegistrationBean was successful
  - or emit ERROR level logs like "SLACalculator memory capacity reached. Cannot add or update new SLA Registration entry for job", showing that the add / update of the SLARegistrationBean was not successful
Since stale or already processed SLAEvent entries are sometimes removed from SLACalculatorMemory#slaMap, it's pretty hard to say what its actual size is, that is, whether the next add or update command will succeed.
We need an Instrumentation.Counter instance that gets incremented when a SLACalculatorMemory#slaMap#put() adds a new entry, and gets decremented when a SLACalculatorMemory#slaMap#remove() removes an existing entry. This counter will be automatically present in the REST interface and the Oozie client.
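
The counting rule above can be sketched roughly as follows. This is a minimal, standalone illustration and not the actual SLACalculatorMemory code: the class SlaMapCounterSketch, its method names, and the use of a plain AtomicLong in place of Instrumentation.Counter are all assumptions made for the example. The point is only that the counter changes when put() adds a new key or remove() deletes an existing one, and stays unchanged on updates and on removals of missing keys.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the slaMap bookkeeping; not Oozie's real class.
class SlaMapCounterSketch {
    private final Map<String, Object> slaMap = new ConcurrentHashMap<>();
    // Stand-in for the proposed Instrumentation.Counter (assumption).
    private final AtomicLong entryCounter = new AtomicLong();
    private final int capacity;

    SlaMapCounterSketch(int capacity) {
        this.capacity = capacity;
    }

    // Adds or updates an entry; returns false when a new entry would exceed capacity.
    // Note: the capacity check is not atomic with the put in this sketch.
    boolean put(String jobId, Object entry) {
        if (!slaMap.containsKey(jobId) && slaMap.size() >= capacity) {
            return false;
        }
        Object previous = slaMap.put(jobId, entry);
        if (previous == null) {
            // Only a genuinely new entry increments the counter.
            entryCounter.incrementAndGet();
        }
        return true;
    }

    // Removes an entry; the counter is decremented only if the entry existed.
    boolean remove(String jobId) {
        Object removed = slaMap.remove(jobId);
        if (removed != null) {
            entryCounter.decrementAndGet();
            return true;
        }
        return false;
    }

    long counterValue() {
        return entryCounter.get();
    }
}
```

With such a counter exposed, comparing its value against oozie.sla.service.SLAService.capacity would indicate whether the next add is likely to succeed.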