Details
- Bug
- Status: Resolved
- Normal
- Resolution: Fixed
- None
- Normal
Description
Separate from CASSANDRA-13754, there seems to be another memory leak in 3.11.0+ in BTree$Builder / io.netty.util.Recycler$Stack.
- Heap utilization increased after upgrading to 3.11.0 => cassandra_3.11.0_min_memory_utilization.jpg
- No difference after upgrading to 3.11.1 (snapshot build) => cassandra_3.11.1_snapshot_heaputilization.png; thus the leak is most likely just more visible now that CASSANDRA-13754 is fixed
- MAT shows io.netty.util.Recycler$Stack as the top contributing class => cassandra_3.11.1_mat_dominator_classes.png
- With -Xmx8G (CMS) and our load pattern, we have to do a rolling restart after ~72 hours
I verified the following fix: explicitly clear the recycleHandle member after recycling (which requires making it non-final) in org.apache.cassandra.utils.btree.BTree.Builder.recycle():

    public void recycle()
    {
        if (recycleHandle != null)
        {
            this.cleanup();
            builderRecycler.recycle(this, recycleHandle);
            recycleHandle = null; // ADDED
        }
    }
I patched a single node in our load-test cluster with this change; after ~10 hours of uptime, MAT no longer shows any sign of the previously offending class => cassandra_3.11.1_mat_dominator_classes_FIXED.png
I can't say whether this has any other side effects, but I doubt it.
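To make the shape of the recycle() change easier to see in isolation, here is a minimal, self-contained sketch. It is not Cassandra's BTree.Builder and not Netty's Recycler: Pool, Handle, Builder and RecycleHandleSketch are hypothetical stand-ins, and re-attaching the handle when an instance is reused is a choice made for this sketch only, not something the patch above does. It only illustrates that clearing a non-final handle after recycling makes a repeated recycle() call a no-op and drops the object's reference to the handle.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class RecycleHandleSketch {

    /** Hypothetical per-object handle; the pool keeps these, and each one references its object. */
    static final class Handle {
        final Builder value;
        Handle(Builder value) { this.value = value; }
    }

    /** Hypothetical object pool (a simplified stand-in, NOT io.netty.util.Recycler). */
    static final class Pool {
        private final Deque<Handle> stack = new ArrayDeque<>();

        Builder get() {
            Handle h = stack.poll();
            if (h != null) {
                h.value.handle = h;      // assumption of this sketch: re-attach on reuse
                return h.value;
            }
            Builder b = new Builder(this);
            b.handle = new Handle(b);
            return b;
        }

        void recycle(Handle h) { stack.push(h); }

        int pooled() { return stack.size(); }
    }

    /** Hypothetical recyclable builder mirroring the structure of the patched method. */
    static final class Builder {
        private final Pool pool;
        private final List<Object> values = new ArrayList<>();
        Handle handle;                   // non-final so that it can be cleared

        Builder(Pool pool) { this.pool = pool; }

        Builder add(Object v) { values.add(v); return this; }

        int size() { return values.size(); }

        void recycle() {
            if (handle != null) {
                values.clear();          // stands in for cleanup()
                pool.recycle(handle);
                handle = null;           // corresponds to the ADDED line
            }
        }
    }

    public static void main(String[] args) {
        Pool pool = new Pool();
        Builder b = pool.get();
        b.add("a").add("b");
        b.recycle();
        b.recycle();                     // no-op: the handle was cleared by the first call
        System.out.println("pooled after double recycle: " + pool.pooled());
        Builder reused = pool.get();
        System.out.println("same instance reused: " + (reused == b));
        System.out.println("size after reuse: " + reused.size());
    }
}
```

Whether the real recycler hands the same handle back on reuse is exactly the kind of side effect that would need checking against the actual Netty and Cassandra code before relying on this pattern.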
Attachments
Issue Links
- is related to
  - CASSANDRA-15430 Cassandra 3.0.18: BatchMessage.execute - 10x more on-heap allocations compared to 2.1.18 (Resolved)
- relates to
  - CASSANDRA-9766 Faster Streaming (Resolved)