Details
Bug
Status: Resolved
Resolution: Fixed
Description
The current default for saving key caches is every hour. Additionally, the default timeout for flushing memtables is also one hour. I've seen situations where both of these occurring at the same time every hour puts enough pressure on the node that it drops messages and other nodes mark it dead. This happens across the cluster and results in flapping.
We should do something to spread this out, perhaps by staggering the cache saves and flushes that occur due to timeouts.
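One possible way to stagger the work is to give each periodic task a random initial delay within its period, so that tasks sharing the same one-hour interval are unlikely to fire together. The following is only a rough sketch of that idea, not the fix that was committed: `StaggeredScheduler`, `initialDelayMillis`, and `scheduleStaggered` are hypothetical names, and it assumes the tasks run on a plain `java.util.concurrent.ScheduledExecutorService` at a fixed rate, which may not match how cache saves and memtable flushes are actually scheduled.

```java
import java.util.Random;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Cassandra: staggers periodic tasks by
// giving each one a random initial delay within its own period.
final class StaggeredScheduler {
    private StaggeredScheduler() {}

    // Returns a delay in [0, periodMillis) so that tasks with the same
    // period do not all start at the same instant.
    static long initialDelayMillis(long periodMillis, Random random) {
        if (periodMillis <= 0) {
            throw new IllegalArgumentException("periodMillis must be positive");
        }
        return (long) (random.nextDouble() * periodMillis);
    }

    // Schedules the task at a fixed rate, offset by a random initial delay.
    // The period between runs is unchanged; only the phase is randomized.
    static ScheduledFuture<?> scheduleStaggered(ScheduledExecutorService executor,
                                                Runnable task,
                                                long periodMillis,
                                                Random random) {
        long delay = initialDelayMillis(periodMillis, random);
        return executor.scheduleAtFixedRate(task, delay, periodMillis, TimeUnit.MILLISECONDS);
    }
}
```

With something like this, a key cache save and a timeout-driven flush would each be scheduled through `scheduleStaggered` with their own random offset. Randomizing the phase only makes a collision less likely rather than impossible, so an explicit minimum gap between the two tasks might still be needed.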