Details
-
Bug
-
Status: Resolved
-
Urgent
-
Resolution: Fixed
-
None
-
Availability - Process Crash
-
Critical
-
Normal
-
User Report
-
All
-
None
-
Description
I haven't gotten to the root cause of this yet. Several 4.1 nodes have crashed in production. I'm not sure whether this is related to Paxos v2, but Paxos v2 is enabled. offheap_objects is also enabled.
I'm not yet sure whether this affects 5.0.
Most of the crashes don't have a stack trace. They only reference this:
Stack: [0x00007fabf4c34000,0x00007fabf4d34000], sp=0x00007fabf4d31f00, free space=1015k
Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
v ~StubRoutines::jint_disjoint_arraycopy
All of these crashes are in the ScheduledTasks thread.
However, one node does have this in the crash log:
--------------- T H R E A D ---------------
Current thread (0x000078b375eac800): JavaThread "ScheduledTasks:1" daemon [_thread_in_Java, id=151791, stack(0x000078b34b780000,0x000078b34b880000)]
Stack: [0x000078b34b780000,0x000078b34b880000], sp=0x000078b34b87c350, free space=1008k
Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
J 29467 c2 org.apache.cassandra.db.rows.AbstractCell.clone(Lorg/apache/cassandra/utils/memory/ByteBufferCloner;)Lorg/apache/cassandra/db/rows/Cell; (50 bytes) @ 0x000078b3dd40a42f [0x000078b3dd409de0+0x000000000000064f]
J 17669 c2 org.apache.cassandra.db.rows.Cell.clone(Lorg/apache/cassandra/utils/memory/Cloner;)Lorg/apache/cassandra/db/rows/ColumnData; (6 bytes) @ 0x000078b3dc54edc0 [0x000078b3dc54ed40+0x0000000000000080]
J 17816 c2 org.apache.cassandra.db.rows.BTreeRow$$Lambda$845.apply(Ljava/lang/Object;)Ljava/lang/Object; (12 bytes) @ 0x000078b3dbed01a4 [0x000078b3dbed0120+0x0000000000000084]
J 17828 c2 org.apache.cassandra.utils.btree.BTree.transform([Ljava/lang/Object;Ljava/util/function/Function;)[Ljava/lang/Object; (194 bytes) @ 0x000078b3dc5f35f0 [0x000078b3dc5f34a0+0x0000000000000150]
J 35096 c2 org.apache.cassandra.db.rows.BTreeRow.clone(Lorg/apache/cassandra/utils/memory/Cloner;)Lorg/apache/cassandra/db/rows/Row; (37 bytes) @ 0x000078b3dda9111c [0x000078b3dda90fe0+0x000000000000013c]
J 30500 c2 org.apache.cassandra.utils.memory.EnsureOnHeap$CloneToHeap.applyToRow(Lorg/apache/cassandra/db/rows/Row;)Lorg/apache/cassandra/db/rows/Row; (16 bytes) @ 0x000078b3dd59b91c [0x000078b3dd59b8c0+0x000000000000005c]
J 26498 c2 org.apache.cassandra.db.transform.BaseRows.hasNext()Z (215 bytes) @ 0x000078b3dcf1c454 [0x000078b3dcf1c180+0x00000000000002d4]
J 30775 c2 org.apache.cassandra.utils.MergeIterator$OneToOne.computeNext()Ljava/lang/Object; (49 bytes) @ 0x000078b3dc789020 [0x000078b3dc788fc0+0x0000000000000060]
J 9082 c2 org.apache.cassandra.utils.AbstractIterator.hasNext()Z (80 bytes) @ 0x000078b3dbb3c544 [0x000078b3dbb3c440+0x0000000000000104]
J 35593 c2 org.apache.cassandra.service.paxos.uncommitted.PaxosRows$PaxosMemtableToKeyStateIterator.computeNext()Lorg/apache/cassandra/service/paxos/uncommitted/PaxosKeyState; (126 bytes) @ 0x000078b3dc7ceeec [0x000078b3dc7cee20+0x00000000000000cc]
J 35591 c2 org.apache.cassandra.service.paxos.uncommitted.PaxosRows$PaxosMemtableToKeyStateIterator.computeNext()Ljava/lang/Object; (5 bytes) @ 0x000078b3dc7d09e4 [0x000078b3dc7d09a0+0x0000000000000044]
J 9082 c2 org.apache.cassandra.utils.AbstractIterator.hasNext()Z (80 bytes) @ 0x000078b3dbb3c544 [0x000078b3dbb3c440+0x0000000000000104]
J 34146 c2 com.google.common.collect.Iterators.addAll(Ljava/util/Collection;Ljava/util/Iterator;)Z (41 bytes) @ 0x000078b3dd9197e8 [0x000078b3dd919680+0x0000000000000168]
J 38256 c1 org.apache.cassandra.service.paxos.uncommitted.PaxosRows.toIterator(Lorg/apache/cassandra/db/partitions/UnfilteredPartitionIterator;Lorg/apache/cassandra/schema/TableId;Z)Lorg/apache/cassandra/utils/CloseableIterator; (49 bytes) @ 0x000078b3d6b677ac [0x000078b3d6b672e0+0x00000000000004cc]
J 34823 c1 org.apache.cassandra.service.paxos.uncommitted.PaxosUncommittedIndex.repairIterator(Lorg/apache/cassandra/schema/TableId;Ljava/util/Collection;)Lorg/apache/cassandra/utils/CloseableIterator; (212 bytes) @ 0x000078b3d5675e0c [0x000078b3d5673be0+0x000000000000222c]
J 38259 c1 org.apache.cassandra.service.paxos.uncommitted.PaxosUncommittedTracker.uncommittedKeyIterator(Lorg/apache/cassandra/schema/TableId;Ljava/util/Collection;)Lorg/apache/cassandra/utils/CloseableIterator; (116 bytes) @ 0x000078b3d6b6bc54 [0x000078b3d6b6b7e0+0x0000000000000474]
J 38257 c1 org.apache.cassandra.service.StorageService.autoRepairPaxos(Lorg/apache/cassandra/schema/TableId;)Lorg/apache/cassandra/utils/concurrent/Future; (57 bytes) @ 0x000078b3d6b6902c [0x000078b3d6b68e00+0x000000000000022c]
j org.apache.cassandra.service.paxos.uncommitted.PaxosUncommittedTracker.schedulePaxosAutoRepairs()V+146
j org.apache.cassandra.service.paxos.uncommitted.PaxosUncommittedTracker$$Lambda$1773.run()V+4
J 39703 c1 org.apache.cassandra.service.paxos.uncommitted.PaxosUncommittedTracker.runAndLogException(Ljava/lang/String;Ljava/lang/Runnable;)V (39 bytes) @ 0x000078b3d435adfc [0x000078b3d435ad00+0x00000000000000fc]
j org.apache.cassandra.service.paxos.uncommitted.PaxosUncommittedTracker.maintenance()V+19
j org.apache.cassandra.service.paxos.uncommitted.PaxosUncommittedTracker$$Lambda$1534.run()V+4
J 30376 c2 java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run()V java.base@11.0.22 (57 bytes) @ 0x000078b3dd56543c [0x000078b3dd565100+0x000000000000033c]
J 27255% c2 java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V java.base@11.0.22 (187 bytes) @ 0x000078b3dd114d58 [0x000078b3dd114ac0+0x0000000000000298]
j java.util.concurrent.ThreadPoolExecutor$Worker.run()V+5 java.base@11.0.22
j io.netty.util.concurrent.FastThreadLocalRunnable.run()V+4
j java.lang.Thread.run()V+11 java.base@11.0.22
v ~StubRoutines::call_stub
V [libjvm.so+0x877453] JavaCalls::call_helper(JavaValue*, methodHandle const&, JavaCallArguments*, Thread*)+0x373
V [libjvm.so+0x875a96] JavaCalls::call_virtual(JavaValue*, Handle, Klass*, Symbol*, Symbol*, Thread*)+0x186
V [libjvm.so+0x925653] thread_entry(JavaThread*, Thread*)+0xa3
V [libjvm.so+0xe41391] JavaThread::thread_main_inner()+0x131
V [libjvm.so+0xe3d790] Thread::call_run()+0x140
V [libjvm.so+0xbf97de] thread_native_entry(Thread*)+0xee