boolean maybeSleep(int messages, long averageGap, long maxCoalesceWindow, Parker parker)
maybeSleep() can enter an infinite loop if messages or averageGap ends up being 0: sleep is then 0, doubling it never changes it, and the while loop never exits. I've seen this on one of my clusters twice this week.
This can happen if, in averageGap(), sum is bigger than MEASURED_INTERVAL, which should be pretty rare but apparently happened to me.
Even if the diagnosis is wrong (and I'm pretty sure that this thread was using 100% CPU doing nothing), the fix seems pretty safe to apply.
diff --git a/src/java/org/apache/cassandra/utils/CoalescingStrategies.java b/src/java/org/apache/cassandra/utils/CoalescingStrategies.java
index 0aa980f..982d4a6 100644
@@ -100,7 +100,7 @@ public class CoalescingStrategies
long sleep = messages * averageGap;
- if (sleep > maxCoalesceWindow)
+ if (sleep <= 0 || sleep > maxCoalesceWindow)
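To make the failure mode concrete, here is a minimal standalone sketch, not the actual Cassandra code. It assumes the loop in maybeSleep() doubles sleep while `sleep * 2 < maxCoalesceWindow`, which is what the description above implies. The class `CoalesceSleepSketch` and the helpers `doublingSteps` and `computeSleep` are hypothetical names made up for illustration, and the unguarded loop is capped so the demo terminates. The guard is written as `sleep <= 0` because Java does not allow `!` on a long.

```java
public class CoalesceSleepSketch {

    // Hypothetical helper: counts how many times the (assumed) doubling loop
    // runs for a given starting sleep value. The cap exists only so this demo
    // terminates; without it, sleep == 0 would spin forever because 0 * 2 == 0.
    static int doublingSteps(long sleep, long maxCoalesceWindow, int cap) {
        int steps = 0;
        while (sleep * 2 < maxCoalesceWindow && steps < cap) {
            sleep *= 2;
            steps++;
        }
        return steps;
    }

    // Hypothetical helper: the guarded logic. Returns the sleep duration that
    // would be chosen, or -1 when the caller should not sleep at all.
    static long computeSleep(int messages, long averageGap, long maxCoalesceWindow) {
        long sleep = messages * averageGap;
        // Reject zero (and negative) values before entering the doubling loop.
        if (sleep <= 0 || sleep > maxCoalesceWindow)
            return -1;
        while (sleep * 2 < maxCoalesceWindow)
            sleep *= 2;
        return sleep;
    }

    public static void main(String[] args) {
        // Unguarded loop with sleep == 0: only the cap stops it.
        System.out.println("steps with sleep=0:  " + doublingSteps(0, 200, 1000));
        // Unguarded loop with a positive sleep: stops on its own.
        System.out.println("steps with sleep=40: " + doublingSteps(40, 200, 1000));
        // Guarded version: declines to sleep when messages is 0.
        System.out.println("guarded, messages=0: " + computeSleep(0, 10, 200));
        System.out.println("guarded, messages=4: " + computeSleep(4, 10, 200));
    }
}
```

With a starting value of 0 the loop only stops because of the artificial cap, which matches the 100% CPU symptom. With the guard in place the method returns before the loop is reached.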