  1. Cassandra
  2. CASSANDRA-13886

OOM put node in limbo


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Low
    • Resolution: Duplicate
    • Fix Version/s: None
    • Component/s: None
    • Labels:
    • Environment:

      Cassandra version 2.2.10

    • Severity:
      Low

      Description

      One of our test clusters has been hitting OutOfMemoryError (OOM). While investigating, we found that one of the nodes that hit OOM did not shut down properly. Instead it entered a half-up state in which the affected node considered itself up while all other nodes considered it down.

      The following stack trace was observed and appears to be the cause:

      java.lang.NoClassDefFoundError: Could not initialize class java.lang.UNIXProcess
              at java.lang.ProcessImpl.start(ProcessImpl.java:130) ~[na:1.8.0_131]
              at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029) ~[na:1.8.0_131]
              at java.lang.Runtime.exec(Runtime.java:620) ~[na:1.8.0_131]
              at java.lang.Runtime.exec(Runtime.java:485) ~[na:1.8.0_131]
              at org.apache.cassandra.utils.HeapUtils.generateHeapDump(HeapUtils.java:88) ~[apache-cassandra-2.2.10.jar:2.2.10]
              at org.apache.cassandra.utils.JVMStabilityInspector.inspectThrowable(JVMStabilityInspector.java:56) ~[apache-cassandra-2.2.10.jar:2.2.10]
              at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:168) ~[apache-cassandra-2.2.10.jar:2.2.10]
              at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136) ~[apache-cassandra-2.2.10.jar:2.2.10]
              at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) ~[apache-cassandra-2.2.10.jar:2.2.10]
              at java.lang.Thread.run(Thread.java:748) [na:1.8.0_131]
      

      It seems that if an unexpected exception or error is thrown inside JVMStabilityInspector.inspectThrowable, the JVM is not shut down and keeps running. In the trace above, the NoClassDefFoundError was raised from HeapUtils.generateHeapDump, which inspectThrowable calls. My expectation is that the JVM should shut down whenever an OOM is thrown.
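To illustrate the failure mode, here is a minimal sketch. It is not the Cassandra source: the class StabilitySketch and the methods inspectFragile, inspectDefensive, and failingDiagnostics are hypothetical names, and it assumes the heap-dump step runs before the shutdown step, as the stack trace suggests. The kill action is passed in as a Runnable so the sketch can run without actually halting the JVM.

```java
// Hypothetical sketch of the failure mode; not the actual Cassandra code.
class StabilitySketch {

    /** Stand-in for a diagnostics step (such as a heap dump) that itself fails. */
    public static void failingDiagnostics() {
        throw new NoClassDefFoundError("Could not initialize class java.lang.UNIXProcess");
    }

    /** Fragile ordering: if diagnostics throw, the kill step is never reached. */
    public static void inspectFragile(Throwable t, Runnable diagnostics, Runnable kill) {
        if (t instanceof OutOfMemoryError) {
            diagnostics.run();
            kill.run();
        }
    }

    /** Defensive ordering: diagnostics are best effort, the kill step always runs. */
    public static void inspectDefensive(Throwable t, Runnable diagnostics, Runnable kill) {
        if (t instanceof OutOfMemoryError) {
            try {
                diagnostics.run();
            } catch (Throwable ignored) {
                // Diagnostics failed; shutting down matters more than the heap dump.
            } finally {
                kill.run();
            }
        }
    }

    public static void main(String[] args) {
        boolean[] fragileKilled = {false};
        try {
            inspectFragile(new OutOfMemoryError(), StabilitySketch::failingDiagnostics,
                           () -> fragileKilled[0] = true);
        } catch (Throwable e) {
            // The secondary error escapes and the process keeps running.
        }
        System.out.println("fragile killed: " + fragileKilled[0]);

        boolean[] defensiveKilled = {false};
        inspectDefensive(new OutOfMemoryError(), StabilitySketch::failingDiagnostics,
                         () -> defensiveKilled[0] = true);
        System.out.println("defensive killed: " + defensiveKilled[0]);
    }
}
```

In a real process the kill action would be something like Runtime.getRuntime().halt(...), in which case it would not matter whether the secondary error is swallowed or propagated.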

      A potential workaround is to add:

      JVM_OPTS="$JVM_OPTS -XX:+ExitOnOutOfMemoryError"
      

      to cassandra-env.sh.


            People

            • Assignee:
              tommy_s Tommy Stendahl
              Reporter:
              molsson Marcus Olsson
              Authors:
              Tommy Stendahl
