CASSANDRA-13931

Cassandra JVM stops itself randomly

Details

    • Bug
    • Status: Open
    • Normal
    • Resolution: Unresolved
    • None
    • Legacy/Core
    • None
    • Environment:
      RHEL 7.3
      JDK HotSpot 1.8.0_121-b13
      cassandra-3.11 cluster with 43 nodes in 9 datacenters
      8 vCPU, 32 GB RAM

    • Normal

    Description

      Before I set -XX:MaxDirectMemorySize, the java process was being killed by the OS-level OOM killer, for example:

      # grep "Out of" /var/log/messages-20170918
      Sep 16 06:54:07 p00skimnosql04 kernel: Out of memory: Kill process 26619 (java) score 287 or sacrifice child
      Sep 16 06:54:07 p00skimnosql04 kernel: Out of memory: Kill process 26640 (java) score 289 or sacrifice child
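When many nodes have to be checked, it may help to pull the PID and OOM score out of kernel lines like the ones above. The following is only a minimal sketch: the class names `OomKillParser` and `OomKill` are hypothetical, and the regular expression assumes the exact message layout shown in this report.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OomKillParser {
    // Assumed layout, taken from the log lines in this report:
    // "... kernel: Out of memory: Kill process 26619 (java) score 287 or sacrifice child"
    private static final Pattern OOM_KILL = Pattern.compile(
            "Out of memory: Kill process (\\d+) \\(([^)]+)\\) score (\\d+)");

    /** One OOM-killer decision extracted from a single log line. */
    static final class OomKill {
        final int pid;
        final String command;
        final int score;

        OomKill(int pid, String command, int score) {
            this.pid = pid;
            this.command = command;
            this.score = score;
        }
    }

    /** Returns the parsed kill record, or empty if the line is not an OOM-kill message. */
    static Optional<OomKill> parse(String line) {
        Matcher m = OOM_KILL.matcher(line);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(new OomKill(
                Integer.parseInt(m.group(1)),
                m.group(2),
                Integer.parseInt(m.group(3))));
    }

    public static void main(String[] args) {
        String line = "Sep 16 06:54:07 p00skimnosql04 kernel: Out of memory: "
                + "Kill process 26619 (java) score 287 or sacrifice child";
        parse(line).ifPresent(k ->
                System.out.println("pid=" + k.pid + " command=" + k.command + " score=" + k.score));
    }
}
```

Lines that do not match the pattern yield an empty result, so the parser can be run over a whole messages file without pre-filtering.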

      With the limit -XX:MaxDirectMemorySize=5G set, the node instead periodically starts logging:
      HeapUtils.java:136 - Dumping heap to /egov/dumps/cassandra-1506868110-pid11155.hprof
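One way to watch how close a JVM is to its direct-memory limit is to read the standard "direct" buffer pool through `BufferPoolMXBean`. The sketch below is an illustration, not something taken from Cassandra itself: the class name `DirectMemoryProbe` is hypothetical, and the pool only accounts for buffers created through `ByteBuffer.allocateDirect`, so native allocations made by other means would not show up in it.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

class DirectMemoryProbe {
    /** Bytes currently held by the JVM's "direct" buffer pool, or -1 if that pool is not exposed. */
    static long directBytesUsed() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1L;
    }

    public static void main(String[] args) {
        long before = directBytesUsed();
        // Allocate 1 MiB off-heap so the change in the pool is visible.
        ByteBuffer buf = ByteBuffer.allocateDirect(1 << 20);
        long after = directBytesUsed();
        System.out.println("direct pool before: " + before + " bytes");
        System.out.println("direct pool after:  " + after + " bytes");
        // Touch the buffer afterwards so it stays reachable during the measurement.
        System.out.println("buffer capacity:    " + buf.capacity() + " bytes");
    }
}
```

The same MXBean is exposed over JMX, so the value can also be sampled on a running node and compared against the configured limit over time.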

      It seems the JVM kills itself when an off-heap memory leak occurs.
      Typical errors in system.log before the JVM begins dumping the heap:

      ERROR [MessagingService-Incoming-/172.20.4.143] 2017-10-01 19:00:36,336 CassandraDaemon.java:228 - Exception in thread Thread[MessagingService-Incoming-/172.20.4.143,5,main]
      ERROR [Native-Transport-Requests-139] 2017-10-01 19:04:02,675 Message.java:625 - Unexpected exception during request; channel = [id: 0x3c0c1c26, L:/172.20.4.142:9042 - R:/172.20.4.139:44874]

      Full stack traces:

      ERROR [Native-Transport-Requests-139] 2017-10-01 19:04:02,675 Message.java:625 - Unexpected exception during request; channel = [id: 0x3c0c1c26, L:/172.20.4.142:9042 - R:/172.20.4.139:44874]
      java.lang.AssertionError: null
              at org.apache.cassandra.transport.ServerConnection.applyStateTransition(ServerConnection.java:97) ~[apache-cassandra-3.11.0.jar:3.11.0]
              at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:521) [apache-cassandra-3.11.0.jar:3.11.0]
              at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:410) [apache-cassandra-3.11.0.jar:3.11.0]
              at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) [netty-all-4.0.44.Final.jar:4.0.44.Final]
              at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357) [netty-all-4.0.44.Final.jar:4.0.44.Final]
              at io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:35) [netty-all-4.0.44.Final.jar:4.0.44.Final]
              at io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:348) [netty-all-4.0.44.Final.jar:4.0.44.Final]
              at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_121]
              at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162) [apache-cassandra-3.11.0.jar:3.11.0]
              at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) [apache-cassandra-3.11.0.jar:3.11.0]
              at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
      
      INFO  [MutationStage-127] 2017-10-01 19:08:24,255 HeapUtils.java:136 - Dumping heap to /egov/dumps/cassandra-1506868110-pid11155.hprof ...
      Heap dump file created
      
      ERROR [MessagingService-Incoming-/172.20.4.143] 2017-10-01 19:08:33,493 CassandraDaemon.java:228 - Exception in thread Thread[MessagingService-Incoming-/172.20.4.143,5,main]
      java.io.IOError: java.io.EOFException: Stream ended prematurely
              at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:227) ~[apache-cassandra-3.11.0.jar:3.11.0]
              at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:215) ~[apache-cassandra-3.11.0.jar:3.11.0]
              at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) ~[apache-cassandra-3.11.0.jar:3.11.0]
              at org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize30(PartitionUpdate.java:839) ~[apache-cassandra-3.11.0.jar:3.11.0]
              at org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize(PartitionUpdate.java:800) ~[apache-cassandra-3.11.0.jar:3.11.0]
              at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:415) ~[apache-cassandra-3.11.0.jar:3.11.0]
              at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:434) ~[apache-cassandra-3.11.0.jar:3.11.0]
              at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:371) ~[apache-cassandra-3.11.0.jar:3.11.0]
              at org.apache.cassandra.net.MessageIn.read(MessageIn.java:123) ~[apache-cassandra-3.11.0.jar:3.11.0]
              at org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:192) ~[apache-cassandra-3.11.0.jar:3.11.0]
              at org.apache.cassandra.net.IncomingTcpConnection.receiveMessages(IncomingTcpConnection.java:180) ~[apache-cassandra-3.11.0.jar:3.11.0]
              at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:94) ~[apache-cassandra-3.11.0.jar:3.11.0]
      Caused by: java.io.EOFException: Stream ended prematurely
              at net.jpountz.lz4.LZ4BlockInputStream.readFully(LZ4BlockInputStream.java:218) ~[lz4-1.3.0.jar:na]
              at net.jpountz.lz4.LZ4BlockInputStream.refill(LZ4BlockInputStream.java:150) ~[lz4-1.3.0.jar:na]
              at net.jpountz.lz4.LZ4BlockInputStream.read(LZ4BlockInputStream.java:117) ~[lz4-1.3.0.jar:na]
              at java.io.DataInputStream.readFully(DataInputStream.java:195) ~[na:1.8.0_121]
              at java.io.DataInputStream.readFully(DataInputStream.java:169) ~[na:1.8.0_121]
              at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:402) ~[apache-cassandra-3.11.0.jar:3.11.0]
              at org.apache.cassandra.db.marshal.AbstractType.readValue(AbstractType.java:437) ~[apache-cassandra-3.11.0.jar:3.11.0]
              at org.apache.cassandra.db.rows.Cell$Serializer.deserialize(Cell.java:245) ~[apache-cassandra-3.11.0.jar:3.11.0]
              at org.apache.cassandra.db.rows.UnfilteredSerializer.readSimpleColumn(UnfilteredSerializer.java:639) ~[apache-cassandra-3.11.0.jar:3.11.0]
              at org.apache.cassandra.db.rows.UnfilteredSerializer.lambda$deserializeRowBody$1(UnfilteredSerializer.java:604) ~[apache-cassandra-3.11.0.jar:3.11.0]
              at org.apache.cassandra.utils.btree.BTree.applyForwards(BTree.java:1242) ~[apache-cassandra-3.11.0.jar:3.11.0]
              at org.apache.cassandra.utils.btree.BTree.apply(BTree.java:1197) ~[apache-cassandra-3.11.0.jar:3.11.0]
              at org.apache.cassandra.db.Columns.apply(Columns.java:377) ~[apache-cassandra-3.11.0.jar:3.11.0]
              at org.apache.cassandra.db.rows.UnfilteredSerializer.deserializeRowBody(UnfilteredSerializer.java:600) ~[apache-cassandra-3.11.0.jar:3.11.0]
              at org.apache.cassandra.db.rows.UnfilteredSerializer.deserializeOne(UnfilteredSerializer.java:475) ~[apache-cassandra-3.11.0.jar:3.11.0]
              at org.apache.cassandra.db.rows.UnfilteredSerializer.deserialize(UnfilteredSerializer.java:431) ~[apache-cassandra-3.11.0.jar:3.11.0]
              at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:222) ~[apache-cassandra-3.11.0.jar:3.11.0]
              ... 11 common frames omitted
      

      I also tried setting -XX:+ExplicitGCInvokesConcurrent on some other nodes, but without success.
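Since these options are passed in at JVM startup, it can be worth confirming that a given node really started with them. The sketch below reads the startup arguments through the standard `RuntimeMXBean`; the class name `JvmFlagCheck` and the helper `findFlag` are hypothetical, and the flag prefixes are simply the two options mentioned in this report.

```java
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.Optional;

class JvmFlagCheck {
    /** Returns the last startup argument that begins with the given prefix, if any. */
    static Optional<String> findFlag(List<String> jvmArgs, String prefix) {
        return jvmArgs.stream()
                .filter(arg -> arg.startsWith(prefix))
                .reduce((first, second) -> second);
    }

    public static void main(String[] args) {
        // Arguments the running JVM was started with (excluding those passed to main).
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        String[] prefixes = {"-XX:MaxDirectMemorySize", "-XX:+ExplicitGCInvokesConcurrent"};
        for (String prefix : prefixes) {
            System.out.println(prefix + " -> " + findFlag(jvmArgs, prefix).orElse("(not set)"));
        }
    }
}
```

Run inside the target JVM, or adapted to read the same attribute over JMX, this shows whether each option is present on that node.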

      Attachments

        1. cassandra-env.sh
          14 kB
          Andrey Lataev
        2. cassandra.yaml
          5 kB
          Andrey Lataev
        3. system.log.2017-10-01.zip
          10.91 MB
          Andrey Lataev

          People

            Unassigned Unassigned
            Ljus Andrey Lataev