Details
Type: Bug
Status: Open
Priority: Normal
Resolution: Unresolved
Environment:
RHEL 7.3
JDK HotSpot 1.8.0_121-b13
cassandra-3.11 cluster with 43 nodes in 9 datacenters
8 vCPU, 32 GB RAM
Severity: Normal
Description
Before I set -XX:MaxDirectMemorySize, the java process was killed by the OS-level OOM killer, for example:
# grep "Out of" /var/log/messages-20170918
Sep 16 06:54:07 p00skimnosql04 kernel: Out of memory: Kill process 26619 (java) score 287 or sacrifice child
Sep 16 06:54:07 p00skimnosql04 kernel: Out of memory: Kill process 26640 (java) score 289 or sacrifice child
After I set the limit -XX:MaxDirectMemorySize=5G, I periodically began to receive:
HeapUtils.java:136 - Dumping heap to /egov/dumps/cassandra-1506868110-pid11155.hprof
It seems that the JVM kills itself when off-heap memory leaks occur.
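To help confirm whether direct (off-heap) buffers are what is growing, the JVM's own buffer pool counters can be read. The following is only a minimal sketch, not part of Cassandra: the class name DirectMemoryProbe and its method are hypothetical, and it uses the standard java.lang.management.BufferPoolMXBean API. It reports memory obtained through ByteBuffer.allocateDirect, which is what -XX:MaxDirectMemorySize limits; native memory allocated through other paths would not appear here.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;
import java.util.List;

// Hypothetical helper for illustration; not a Cassandra class.
final class DirectMemoryProbe {

    /**
     * Returns the JVM's estimate of bytes used by the "direct" buffer pool,
     * or -1 if the JVM does not report such a pool.
     */
    static long directPoolUsedBytes() {
        List<BufferPoolMXBean> pools =
                ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class);
        for (BufferPoolMXBean pool : pools) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1L;
    }

    public static void main(String[] args) {
        long before = directPoolUsedBytes();
        // Allocate 1 MiB off-heap so the counter visibly changes.
        ByteBuffer buf = ByteBuffer.allocateDirect(1 << 20);
        long after = directPoolUsedBytes();
        System.out.println("direct pool bytes: before=" + before
                + " after=" + after + " (allocated " + buf.capacity() + ")");
    }
}
```

The same counters are exposed over JMX under java.nio:type=BufferPool,name=direct, so they can be sampled on a running node without adding code. If this value stays well below the 5G limit while the process RSS keeps growing, the leak is probably outside the direct buffer pool.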
Typical errors in system.log before the JVM begins dumping:
ERROR [MessagingService-Incoming-/172.20.4.143] 2017-10-01 19:00:36,336 CassandraDaemon.java:228 - Exception in thread Thread[MessagingService-Incoming-/172.20.4.143,5,main]
ERROR [Native-Transport-Requests-139] 2017-10-01 19:04:02,675 Message.java:625 - Unexpected exception during request; channel = [id: 0x3c0c1c26, L:/172.20.4.142:9042 - R:/172.20.4.139:44874]
Full stack traces:
ERROR [Native-Transport-Requests-139] 2017-10-01 19:04:02,675 Message.java:625 - Unexpected exception during request; channel = [id: 0x3c0c1c26, L:/172.20.4.142:9042 - R:/172.20.4.139:44874]
java.lang.AssertionError: null
at org.apache.cassandra.transport.ServerConnection.applyStateTransition(ServerConnection.java:97) ~[apache-cassandra-3.11.0.jar:3.11.0]
at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:521) [apache-cassandra-3.11.0.jar:3.11.0]
at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:410) [apache-cassandra-3.11.0.jar:3.11.0]
at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) [netty-all-4.0.44.Final.jar:4.0.44.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357) [netty-all-4.0.44.Final.jar:4.0.44.Final]
at io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:35) [netty-all-4.0.44.Final.jar:4.0.44.Final]
at io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:348) [netty-all-4.0.44.Final.jar:4.0.44.Final]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_121]
at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162) [apache-cassandra-3.11.0.jar:3.11.0]
at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) [apache-cassandra-3.11.0.jar:3.11.0]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
INFO [MutationStage-127] 2017-10-01 19:08:24,255 HeapUtils.java:136 - Dumping heap to /egov/dumps/cassandra-1506868110-pid11155.hprof ... Heap dump file created
ERROR [MessagingService-Incoming-/172.20.4.143] 2017-10-01 19:08:33,493 CassandraDaemon.java:228 - Exception in thread Thread[MessagingService-Incoming-/172.20.4.143,5,main]
java.io.IOError: java.io.EOFException: Stream ended prematurely
at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:227) ~[apache-cassandra-3.11.0.jar:3.11.0]
at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:215) ~[apache-cassandra-3.11.0.jar:3.11.0]
at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) ~[apache-cassandra-3.11.0.jar:3.11.0]
at org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize30(PartitionUpdate.java:839) ~[apache-cassandra-3.11.0.jar:3.11.0]
at org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize(PartitionUpdate.java:800) ~[apache-cassandra-3.11.0.jar:3.11.0]
at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:415) ~[apache-cassandra-3.11.0.jar:3.11.0]
at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:434) ~[apache-cassandra-3.11.0.jar:3.11.0]
at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:371) ~[apache-cassandra-3.11.0.jar:3.11.0]
at org.apache.cassandra.net.MessageIn.read(MessageIn.java:123) ~[apache-cassandra-3.11.0.jar:3.11.0]
at org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:192) ~[apache-cassandra-3.11.0.jar:3.11.0]
at org.apache.cassandra.net.IncomingTcpConnection.receiveMessages(IncomingTcpConnection.java:180) ~[apache-cassandra-3.11.0.jar:3.11.0]
at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:94) ~[apache-cassandra-3.11.0.jar:3.11.0]
Caused by: java.io.EOFException: Stream ended prematurely
at net.jpountz.lz4.LZ4BlockInputStream.readFully(LZ4BlockInputStream.java:218) ~[lz4-1.3.0.jar:na]
at net.jpountz.lz4.LZ4BlockInputStream.refill(LZ4BlockInputStream.java:150) ~[lz4-1.3.0.jar:na]
at net.jpountz.lz4.LZ4BlockInputStream.read(LZ4BlockInputStream.java:117) ~[lz4-1.3.0.jar:na]
at java.io.DataInputStream.readFully(DataInputStream.java:195) ~[na:1.8.0_121]
at java.io.DataInputStream.readFully(DataInputStream.java:169) ~[na:1.8.0_121]
at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:402) ~[apache-cassandra-3.11.0.jar:3.11.0]
at org.apache.cassandra.db.marshal.AbstractType.readValue(AbstractType.java:437) ~[apache-cassandra-3.11.0.jar:3.11.0]
at org.apache.cassandra.db.rows.Cell$Serializer.deserialize(Cell.java:245) ~[apache-cassandra-3.11.0.jar:3.11.0]
at org.apache.cassandra.db.rows.UnfilteredSerializer.readSimpleColumn(UnfilteredSerializer.java:639) ~[apache-cassandra-3.11.0.jar:3.11.0]
at org.apache.cassandra.db.rows.UnfilteredSerializer.lambda$deserializeRowBody$1(UnfilteredSerializer.java:604) ~[apache-cassandra-3.11.0.jar:3.11.0]
at org.apache.cassandra.utils.btree.BTree.applyForwards(BTree.java:1242) ~[apache-cassandra-3.11.0.jar:3.11.0]
at org.apache.cassandra.utils.btree.BTree.apply(BTree.java:1197) ~[apache-cassandra-3.11.0.jar:3.11.0]
at org.apache.cassandra.db.Columns.apply(Columns.java:377) ~[apache-cassandra-3.11.0.jar:3.11.0]
at org.apache.cassandra.db.rows.UnfilteredSerializer.deserializeRowBody(UnfilteredSerializer.java:600) ~[apache-cassandra-3.11.0.jar:3.11.0]
at org.apache.cassandra.db.rows.UnfilteredSerializer.deserializeOne(UnfilteredSerializer.java:475) ~[apache-cassandra-3.11.0.jar:3.11.0]
at org.apache.cassandra.db.rows.UnfilteredSerializer.deserialize(UnfilteredSerializer.java:431) ~[apache-cassandra-3.11.0.jar:3.11.0]
at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:222) ~[apache-cassandra-3.11.0.jar:3.11.0]
... 11 common frames omitted
I also tried setting -XX:+ExplicitGCInvokesConcurrent on some other nodes, but without success.