Details
- Bug
- Status: Closed
- Major
- Resolution: Fixed
- 1.13.0
Description
While running an internal performance testing scenario, we noticed an average degradation of around 15% in the time between an entry being added to the server region and a client with registered CQs receiving the onEvent listener callback.
The scenario uses 2 empty feeder members (DataPolicy.EMPTY) and 4 data members (DataPolicy.REPLICATE); there are also 8 regular clients with CQs registered on the servers. The feeders continuously insert/update custom objects into the region (each entry carries a timestamp), and the clients measure the latency between that original timestamp and the time at which they receive the event through the CqListener.onEvent callback.
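The client-side measurement described above can be sketched roughly as follows. This is a minimal illustration, not the code used in the internal test: the class `UpdateLatencyRecorder` and its methods are hypothetical names, and the sketch assumes the entry timestamp is an epoch-millisecond value and that feeder and client clocks are synchronized. In a real client, `record` would be called from the CqListener.onEvent implementation.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper (not part of Geode): accumulates update latency on a client. */
public class UpdateLatencyRecorder {
    private final AtomicLong totalMillis = new AtomicLong();
    private final AtomicLong count = new AtomicLong();

    /**
     * Records one event.
     *
     * @param entryTimestampMillis timestamp the feeder stored in the entry (assumed epoch millis)
     * @param receivedAtMillis     client time at which the listener callback fired
     * @return the latency of this single event in milliseconds
     */
    public long record(long entryTimestampMillis, long receivedAtMillis) {
        long latency = receivedAtMillis - entryTimestampMillis;
        totalMillis.addAndGet(latency);
        count.incrementAndGet();
        return latency;
    }

    /** Average latency over all recorded events, or 0.0 if none were recorded. */
    public double averageMillis() {
        long n = count.get();
        return n == 0 ? 0.0 : (double) totalMillis.get() / n;
    }

    public static void main(String[] args) {
        UpdateLatencyRecorder recorder = new UpdateLatencyRecorder();
        recorder.record(1_000L, 1_020L); // 20 ms
        recorder.record(2_000L, 2_030L); // 30 ms
        System.out.println(recorder.averageMillis()); // prints 25.0
    }
}
```

Atomic counters are used because listener callbacks may arrive on more than one thread; the average is then compared across builds.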
After troubleshooting the issue, we were able to pinpoint the specific commit at which we start seeing the increase in latency:
commit e9993c15d88a5edd2a486fd64339deba37c24945
Author: Anthony Baker <abaker@apache.org>
Date:   Sat Mar 28 15:35:15 2020 -0700

    GEODE-7765: Update dependencies for v1.13

    Update many but not all dependencies.
The above commit only upgrades several external dependencies, so we ran the internal scenario with various combinations, reverting individual dependencies to their previous "working" versions, until we found the one causing the issue: the upgrade of classgraph from version 4.8.52 to 4.8.68.
We tried upgrading the dependency to the latest released version, 4.8.78, and also increasing the memory to alleviate the extra garbage generated (this worked in the past for another degradation introduced by upgrading the same library), without luck: the degradation is still there.
Further troubleshooting showed that the latency in our test is introduced when moving from classgraph-4.8.61 to classgraph-4.8.62, so the purpose of this ticket is to downgrade the library to version 4.8.61.
================================================================================
CLASSGRAPH 4.8.62
================================================================================
TEST      STATSPEC               OP       #0     #1     #2     #3     #4     #5     #6     #7     #8
                                       63c681d217           e9993c15d8           e9993c15d8 + classgraph-4.8.62
                                       **************       ----------------     #################
scale081  putResponseTime        del     ---    ---  -1.02    ---    ---    ---   1.01   1.01    ---
          putsPerSecond          avg     ---    ---  -1.02    ---  -1.01    ---   1.01    ---  -1.01
          updateEventsPerSecond  avg     ---    ---  -1.02    ---    ---    ---    ---    ---    ---
          updateLatency          del     ---    ---  -1.01  -1.15  -1.19  -1.18  -1.15  -1.13  -1.18
================================================================================
--- = Statistic value is less than the ratio threshold
+inf = Statistic value went from zero to non-zero or vice versa and this is good
-inf = Statistic value went from zero to non-zero or vice versa and this is bad
================================================================================

================================================================================
CLASSGRAPH 4.8.61
================================================================================
TEST      STATSPEC               OP       #0     #1     #2     #3     #4     #5     #6     #7     #8
                                       63c681d217           e9993c15d8           e9993c15d8 + classgraph-4.8.61
                                       **************       ----------------     #################
scale081  putResponseTime        del     ---    ---  -1.02    ---    ---    ---  -1.03    ---    ---
          putsPerSecond          avg     ---    ---  -1.02    ---  -1.01    ---  -1.03  -1.01    ---
          updateEventsPerSecond  avg     ---    ---  -1.02    ---    ---    ---  -1.04    ---    ---
          updateLatency          del     ---    ---  -1.01  -1.15  -1.19  -1.18  -1.01    ---    ---
================================================================================
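As a rough cross-check of the "around 15%" figure, the updateLatency ratios for runs #6 to #8 can be averaged. The sketch below rests on an assumed reading of the report format that the ticket does not confirm: a cell such as -1.15 is taken to mean the statistic is 1.15x the baseline in the bad direction (15% worse), a positive value means better, and "---" is treated as no change. `RatioSummary` and its methods are hypothetical names.

```java
import java.util.Arrays;
import java.util.Locale;

/** Hypothetical helper: turns ratio cells from the comparison report into percentages. */
public class RatioSummary {

    /**
     * Converts one cell to a signed percent change.
     * ASSUMPTION: negative ratios mean "worse", positive mean "better",
     * and "---" (below the ratio threshold) counts as 0% change.
     */
    public static double percentChange(String cell) {
        if (cell.equals("---")) {
            return 0.0;
        }
        double ratio = Double.parseDouble(cell);
        double magnitude = (Math.abs(ratio) - 1.0) * 100.0;
        return ratio < 0 ? -magnitude : magnitude;
    }

    /** Mean signed percent change over a set of cells (0.0 for an empty set). */
    public static double meanPercentChange(String... cells) {
        return Arrays.stream(cells)
                .mapToDouble(RatioSummary::percentChange)
                .average()
                .orElse(0.0);
    }

    public static void main(String[] args) {
        // updateLatency, runs #6-#8, copied from the two tables in this ticket
        double with62 = meanPercentChange("-1.15", "-1.13", "-1.18");
        double with61 = meanPercentChange("-1.01", "---", "---");
        System.out.println(String.format(Locale.ROOT, "classgraph-4.8.62: %.1f%%", with62));
        System.out.println(String.format(Locale.ROOT, "classgraph-4.8.61: %.1f%%", with61));
    }
}
```

Under that reading, the 4.8.62 runs average roughly 15% worse on updateLatency, while the 4.8.61 runs stay within about 1% of the baseline.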