Details

    • Type: Bug
    • Status: In Progress
    • Priority: Minor
    • Resolution: Unresolved
    • Fix Version/s: 3.0.15
    • Component/s: Configuration
    • Labels:
      None

      Description

      With CASSANDRA-9861, a change was added to enable collecting heap dumps by default if the process encountered an OOM error. These heap dumps are stored in the Apache Cassandra home directory unless configured otherwise (see Cassandra Support Document for this feature).
       
      The creation and storage of heap dumps aids debugging and investigative workflows, but may not be desirable in a production environment, where these heap dumps can occupy a large amount of disk space and require manual intervention to clean up.

      Managing heap dumps on out-of-memory errors and configuring the paths for these dumps are already available as JVM options. The current behavior conflicts with the Boolean JVM flag HeapDumpOnOutOfMemoryError.

      A patch is proposed here that would make the heap dump on OOM errors honor the HeapDumpOnOutOfMemoryError flag. Users who still want to generate heap dumps on OOM errors can set the -XX:+HeapDumpOnOutOfMemoryError JVM option.
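As a sketch of the proposed check (illustrative names only, not the actual attached patch), the server could consult the JVM's own input arguments via the standard RuntimeMXBean API before triggering a heap dump:

```java
import java.lang.management.ManagementFactory;
import java.util.List;

public class HeapDumpFlagCheck {
    // Returns true only if -XX:+HeapDumpOnOutOfMemoryError was passed to this JVM.
    // Class and method names are hypothetical; the real patch may differ.
    static boolean heapDumpOnOomEnabled() {
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        return jvmArgs.contains("-XX:+HeapDumpOnOutOfMemoryError");
    }

    public static void main(String[] args) {
        System.out.println("HeapDumpOnOutOfMemoryError set: " + heapDumpOnOomEnabled());
    }
}
```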

        Activity

        cnlwsu Chris Lohfink added a comment -

        I am -1 on this. With current compaction strategies, production nodes should have 30-50% free disk space, and an 8 GB heap dump is the difference between being able to debug bad issues and not. It's like removing log files to save disk space. If anything, making it a "rotating" heap dump that keeps at most 5 or so (like log files) sounds like a better idea.

        blerer Benjamin Lerer added a comment - edited

        anmols Out of curiosity, how often do your nodes die of an OOM error?

        jjordan Jeremiah Jordan added a comment -

        Chris Lohfink This isn't proposing we remove heap dumps, just that we honor the JVM flags for making and storing them. I think that is probably a good idea.

        We should still default our flags to make them in the env files, but users should have control over whether they are written out, without resorting to hacks like setting the output path to a read-only location.

        blerer Benjamin Lerer added a comment -

        I also believe that the issue is legitimate, as the code already relies on the JVM argument for the location where the dumps must be generated. That said, I am not convinced that turning it off is a wise thing to do.

        anmolsharma.141 anmols added a comment -

        Patch for Disabling Heap Dumps if JVM option HeapDumpOnOutOfMemoryError is not set.

        anmolsharma.141 anmols added a comment -

        Disable heap dumps if JVM option HeapDumpOnOutOfMemoryError is not set.

        anmolsharma.141 anmols added a comment -

        Benjamin Lerer We encountered a condition in one of our clusters that resulted in repeated OOMs across many of the nodes. The reason for the OOMs is an issue (CASSANDRA-12796) that was causing heap exhaustion when rebuilding secondary indexes over wide partitions.

        blerer Benjamin Lerer added a comment -

        One thing that I did not consider while working on CASSANDRA-9861 is that not everybody uses the Oracle JVM. It is clear that some people use Zing, and I guess some people might also be using the IBM JVM. All of those JVMs seem to provide jmap, but they do not seem to use the same configuration arguments for handling OOM errors. I could not find the Zing options (Nitsan Wakart do you have any idea of where I could get them?), but the IBM JVM does not use the same options as the Oracle one.
        Moreover, the Oracle JVM also supports options like -XX:OnOutOfMemoryError, which we currently ignore.

        To be honest, I am not fully sure how we should handle all that. We could:

        1. try to determine which JVM is being used and check for the relevant options (knowing that it will never be fully bulletproof)
        2. ignore the JVM options and create our own configuration for it (knowing that it might be confusing for administrators)

        Any opinions or suggestions?
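For what option (1) might look like, here is a minimal sketch that branches on the java.vm.name system property; the substrings matched are assumptions and would need verification on each JVM:

```java
public class JvmVendorCheck {
    // Rough vendor detection via the java.vm.name system property.
    // The matched substrings are assumptions, not verified against every JVM build.
    static String vendorFamily() {
        String vm = System.getProperty("java.vm.name", "").toLowerCase();
        if (vm.contains("hotspot") || vm.contains("openjdk")) return "hotspot";
        if (vm.contains("zing")) return "zing";
        if (vm.contains("ibm") || vm.contains("j9")) return "ibm";
        return "unknown";
    }

    public static void main(String[] args) {
        System.out.println("Detected JVM family: " + vendorFamily());
    }
}
```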

        nitsanw Nitsan Wakart added a comment -

        OnOutOfMemoryError and HeapDumpOnOutOfMemoryError are both available on Zing.
        The default for HeapDumpOnOutOfMemoryError is false.
        Let me know if you need more details.

        anmolsharma.141 anmols added a comment -

        I have suggested a patch for this issue; can you please let me know if it is acceptable?

        blerer Benjamin Lerer added a comment -

        Sorry for the misunderstanding.
        Your patch fixes only one part of the problem, and I would rather fix all the problems at once than solve them in multiple tickets.

        Basically, the fix needs to take into account the JVM being used and handle the processing of OOM errors accordingly. It also needs to support options like -XX:OnOutOfMemoryError.
        Apparently, the Oracle and Zing JVMs use the same option names. The IBM one seems to use different options.
        The patch should also log a clear error message if the JVM is not supported.

        If you are still interested in working on this issue, feel free to reassign it to yourself.

        nibin.gv Nibin G added a comment - edited

        Oracle's JRE 8 and Server JRE 8 for Linux environments are not shipping jmap anymore. That means we have to use Oracle's JDK for heap dumps to be generated from Cassandra, and some security compliance policies won't permit the use of a JDK in production.

        It would be great if an option were provided to disable heap dumps from the application code [1], so that the JVM can generate the heap dump instead. Or use the jcmd utility (available in Server JRE 8 and JDK 8) instead of jmap.

        [1] https://github.com/apache/cassandra/blob/81f6c784ce967fadb6ed7f58de1328e713eaf53c/src/java/org/apache/cassandra/utils/JVMStabilityInspector.java#L56

        jjordan Jeremiah Jordan added a comment -

        We could fall back to trying to use the "com.sun.management:type=HotSpotDiagnostic" bean directly if we can't find jmap.

        Some links for doing this:
        https://blogs.oracle.com/sundararajan/entry/programmatically_dumping_heap_from_java
        http://stackoverflow.com/a/12297339/138693
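A minimal sketch of that fallback, using the documented com.sun.management.HotSpotDiagnosticMXBean interface (HotSpot/OpenJDK only; recent JDKs also require the file name to end in .hprof):

```java
import java.lang.management.ManagementFactory;
import com.sun.management.HotSpotDiagnosticMXBean;

public class ProgrammaticHeapDump {
    // Dumps the heap via the HotSpotDiagnostic MXBean instead of forking jmap.
    // liveOnly=true restricts the dump to reachable objects.
    static void dumpHeap(String outputFile, boolean liveOnly) throws java.io.IOException {
        HotSpotDiagnosticMXBean diagnostic = ManagementFactory.newPlatformMXBeanProxy(
                ManagementFactory.getPlatformMBeanServer(),
                "com.sun.management:type=HotSpotDiagnostic",
                HotSpotDiagnosticMXBean.class);
        diagnostic.dumpHeap(outputFile, liveOnly);  // fails if outputFile already exists
    }
}
```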

        nibin.gv Nibin G added a comment -

        Why can't we delegate heap dump generation to the JVM if jmap is not available on the path? The JRE can generate a heap dump even if jmap is not there.

        urandom Eric Evans added a comment -

        I think this behavior (invoking jmap on OOM) is a pretty serious violation of the principle of least surprise. We already provide mechanisms for passing arguments to the JVM, and TTBMK, all of them provide some means for dropping a heap dump on out-of-memory.

        It definitely caught me by surprise. We carried over -XX:+HeapDumpOnOutOfMemoryError from our 2.2.x environment, only to have Cassandra and the JVM racing to create a dump of the same name.

        Additionally, something about all of this is buggy, because on more than one occasion we've had Cassandra fork-bombing jmap processes:

        ● cassandra-b.service - distributed storage system for structured data
           Loaded: loaded (/lib/systemd/system/cassandra-b.service; static)
           Active: active (running) since Sat 2017-08-05 22:32:07 UTC; 23h ago
         Main PID: 25025 (java)
           CGroup: /system.slice/cassandra-b.service
                   ├─ 9213 jmap -histo 25025
                   ├─ 9214 jmap -histo 25025
                   ├─ 9284 jmap -dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025
                   ├─ 9285 jmap -dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025
                   ├─ 9388 jmap -dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025
                   ├─ 9453 jmap -dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025
                   ├─ 9519 jmap -dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025
                   ├─ 9520 jmap -dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025
                   ├─ 9733 jmap -dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025
                   ├─ 9735 jmap -dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025
                   ├─ 9736 jmap -dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025
                   ├─14835 jmap -dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025
                   ├─14836 jmap -dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025
                   ├─14837 jmap -dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025
                   ├─14839 jmap -dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025
                   ├─14841 jmap -dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025
                   ├─14844 jmap -dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025
                   ├─18932 jmap -dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025
                   ├─18933 jmap -dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025
                   ├─18934 jmap -dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025
                   ├─18935 jmap -dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025
                   ├─18936 jmap -dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025
                   ├─18937 jmap -dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025
                   ├─18938 jmap -dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025
                   ├─18939 jmap -dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025
                   ├─18940 jmap -dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025
                   ├─18942 jmap -dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025
                   ├─18943 jmap -dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025
                   ├─18944 jmap -dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025
                   ├─18945 jmap -dump:format=b,file=/srv/cassandra-b/cassandra-1501972327-pid24937.hprof 25025
                   [ ... ]
        

        IMO, the sanest strategy here would be to leave the creation of heap dumps to the JVM.

        blerer Benjamin Lerer added a comment -

        Eric Evans Sorry, this ticket fell off my radar.

        IMO, the sanest strategy here would be to leave the creation of heap dumps to the JVM.

        I fully agree with you.
        Initially, we started to catch OOM errors to prevent C* from running in an unknown state that could cause data corruption (see CASSANDRA-7507). The problem with that approach was that heap dumps were no longer created on OOM errors. I tried to fix that in CASSANDRA-9861, but that approach seems to have some serious issues and limitations.

        In April 2016, roughly at the same time that I was fixing CASSANDRA-9861, Oracle released the JDK 8u92 which added 2 new JVM Options: ExitOnOutOfMemoryError and CrashOnOutOfMemoryError.

        With that in mind, my idea would be to:

        1. stop handling OOM errors on the C* side (and producing heap dumps)
        2. add ExitOnOutOfMemoryError to the default JVM options in the startup scripts
        3. in the News.txt upgrade procedure: request people to use a java version >= 8u92

        Joshua McKenzie you worked on CASSANDRA-7507. Do you have any concern with the approach I am suggesting?

        My only concern right now is that I know some C* users are using Zing, and I do not know if it supports the ExitOnOutOfMemoryError option.
        I asked the Zing support and will update the ticket as soon as I know more.
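Under that plan, the startup-script change might be as small as the following fragment (the file name and placement are assumptions, shown only for illustration):

```shell
# Hypothetical additions to the JVM options shipped with Cassandra
# (e.g. cassandra-env.sh or a jvm.options file; the exact file is an assumption).
# Requires JDK 8u92 or later.
-XX:+ExitOnOutOfMemoryError
# Alternative for operators who prefer a core dump as well:
# -XX:+CrashOnOutOfMemoryError
```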

        JoshuaMcKenzie Joshua McKenzie added a comment -

        Joshua McKenzie you worked on CASSANDRA-7507. Do you have any concern with the approach I am suggesting?

        No major concerns; maybe we should have a commented out option in the config to add CrashOnOutOfMemoryError for operators that would prefer to use that?

        in the News.txt upgrade procedure: request people to use a java version >= 8u92

        We can also enforce that or warn when we version check in the startup scripts.

        ajorgensen Andrew Jorgensen added a comment - edited

        Just to ask a clarifying question for my own understanding: this would change the default from always producing a heap dump to having to explicitly turn it on, which I think is good. One place this might trip people up is that Cassandra has a command-line argument for setting the specific location of the heap dump; one place I know this is used is the Debian init script. Presumably, if that flag is set the user wants heap dumps, so is it worth also checking whether -XX:HeapDumpPath or -XX:+HeapDumpOnOutOfMemoryError is present when creating a heap dump, to make sure the behavior stays consistent?

        I also have another change, CASSANDRA-13843, that I am proposing to fix the Debian init script, since it shadows the HeapDumpPath environment variable and renders it unchangeable from the default in its current form.
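The consistency check suggested above could be sketched like this (illustrative names only, not the actual patch):

```java
import java.lang.management.ManagementFactory;
import java.util.Optional;

public class HeapDumpPathCheck {
    // Extracts the value of -XX:HeapDumpPath=<dir-or-file>, if the flag was passed.
    // Class and method names are hypothetical; shown only to illustrate the check.
    static Optional<String> configuredHeapDumpPath() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments().stream()
                .filter(arg -> arg.startsWith("-XX:HeapDumpPath="))
                .map(arg -> arg.substring("-XX:HeapDumpPath=".length()))
                .findFirst();
    }

    public static void main(String[] args) {
        System.out.println("HeapDumpPath: " + configuredHeapDumpPath().orElse("(not set)"));
    }
}
```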


          People

          • Assignee:
            blerer Benjamin Lerer
            Reporter:
            anmolsharma.141 anmols
          • Votes:
            0
            Watchers:
            15
