Details
- Bug
- Status: Triage Needed
- Normal
- Resolution: Unresolved
- None
- None
- All
- None
Description
Since CASSANDRA-7507, the drain shutdown hook is removed when the process hits an OutOfMemoryError, to avoid attempting a clean shutdown when the node runs out of heap space.
In CASSANDRA-13006, Cassandra started relying on JVM flags (ExitOnOutOfMemoryError and CrashOnOutOfMemoryError) to stop the node when it hits an OutOfMemoryError.
However, there are non-fatal OutOfMemoryErrors, such as "OutOfMemoryError: unable to create new native thread" or "OutOfMemoryError: Map failed", which do not cause the process to crash even with the ExitOnOutOfMemoryError or CrashOnOutOfMemoryError flags.
Since the shutdown hook is also removed after a non-fatal OutOfMemoryError, it is no longer possible to do a clean shutdown afterwards (via kill with SIGTERM or nodetool stopdaemon).
I believe the intent of CASSANDRA-7507 was to remove the shutdown hook only on fatal OutOfMemoryErrors (such as Heap Space Exhausted), that is, those that cause the node to crash. If a node keeps running after an OutOfMemoryError, the error should not prevent the node from being cleanly shut down afterwards.
We should either make the JVM exit on any OutOfMemory error, or remove the drain shutdown hook only on fatal OutOfMemoryErrors, those that will cause the JVM to crash straight away.
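To illustrate the second option, here is a minimal sketch of a handler that keeps the drain hook in place for the two non-fatal cases named above. This is not the existing Cassandra code: the class OomHookPolicy and its methods are hypothetical, and classifying errors by message text is an assumption made for illustration (the messages vary between JDK versions, so a real patch would likely need something more robust).

```java
import java.util.Locale;

/** Hypothetical helper for illustration only; not part of Cassandra. */
final class OomHookPolicy {
    private OomHookPolicy() {}

    /**
     * Assumption: an OutOfMemoryError counts as non-fatal when its message
     * refers to native thread creation or a failed mmap. Message matching is
     * fragile because the exact text differs between JDK versions.
     */
    static boolean isNonFatal(OutOfMemoryError oom) {
        String msg = oom.getMessage();
        if (msg == null) {
            return false; // unknown cause: treat as fatal
        }
        String m = msg.toLowerCase(Locale.ROOT);
        return m.contains("native thread") || m.contains("map failed");
    }

    /**
     * Removes the drain hook only for errors treated as fatal.
     * Returns true if the hook was registered and has been removed.
     */
    static boolean onOutOfMemory(OutOfMemoryError oom, Thread drainHook) {
        if (isNonFatal(oom)) {
            return false; // node keeps running, so keep clean shutdown possible
        }
        try {
            return Runtime.getRuntime().removeShutdownHook(drainHook);
        } catch (IllegalStateException shutdownInProgress) {
            return false; // JVM is already shutting down; nothing to remove
        }
    }
}
```

With a policy like this, a node that survives a "Map failed" error would still run its drain hook on SIGTERM, while a heap exhaustion error would still drop the hook before the JVM flags take the process down.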
Issue Links
- is related to
  - CASSANDRA-13006 Disable automatic heap dumps on OOM error (Resolved)
  - CASSANDRA-7507 OOM creates unreliable state - die instantly better (Resolved)