  1. Apache Cassandra
  2. CASSANDRA-15191

stop_paranoid disk failure policy is ignored on CorruptSSTableException after node is up

Details

    Description

      There is a bug when disk_failure_policy is set to stop_paranoid and a CorruptSSTableException is thrown after the server is up: the setting is ignored. The node should stop gossip and transport, but it continues to serve requests and the exception is only logged.

       

      This patch unifies the exception handling in JVMStabilityInspector, reworking the code so that the inspector acts as the central place where such exceptions are inspected.

       

      The root cause is that the exception thrown in AbstractLocalAwareExecutorService is not a CorruptSSTableException but a RuntimeException that has the CorruptSSTableException as its cause. It is therefore better to handle this in JVMStabilityInspector, which can examine the cause chain recursively and act accordingly.
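
      The wrapping problem described above can be illustrated with a minimal, self-contained sketch. It is not the code from the patch: the class CauseChainSketch, the helper findCause, and the nested CorruptSSTableException are hypothetical stand-ins, and the depth limit of 64 is an arbitrary guard against cyclic cause chains.

```java
public class CauseChainSketch {
    // Hypothetical stand-in for Cassandra's CorruptSSTableException.
    static class CorruptSSTableException extends RuntimeException {
        CorruptSSTableException(String message) { super(message); }
    }

    // Walk the cause chain and return the first throwable of the wanted type,
    // or null if none is found. The depth limit is an assumed safety guard.
    static <T extends Throwable> T findCause(Throwable t, Class<T> type) {
        int depth = 0;
        for (Throwable cur = t; cur != null && depth < 64; cur = cur.getCause(), depth++) {
            if (type.isInstance(cur)) {
                return type.cast(cur);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // The executor surfaces a RuntimeException whose cause is the real problem.
        Throwable wrapped = new RuntimeException(new CorruptSSTableException("corrupt sstable"));

        // A direct type check on the outer exception misses it.
        System.out.println(wrapped instanceof CorruptSSTableException); // false

        // Inspecting the cause chain finds it.
        System.out.println(findCause(wrapped, CorruptSSTableException.class) != null); // true
    }
}
```

      A handler that only checks the type of the outermost exception would take the "false" branch here, which matches the behaviour reported in this issue.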

      Behaviour before:

      The stop_paranoid setting of disk_failure_policy is ignored when a CorruptSSTableException is thrown, e.g. on a regular SELECT statement.

      Behaviour after:

      Gossip and transport (CQL) are turned off; the JVM stays up for further investigation, e.g. via JMX.
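
      The intended "after" behaviour can be sketched roughly as follows. This is an illustration under stated assumptions, not Cassandra's implementation: the Node class, its fields, and onCorruptSSTable are hypothetical names, and only the stop_paranoid branch is modelled. The enum values mirror the disk_failure_policy options accepted in cassandra.yaml.

```java
public class DiskFailurePolicySketch {
    // Mirrors the option names accepted by disk_failure_policy in cassandra.yaml.
    enum DiskFailurePolicy { die, stop_paranoid, stop, best_effort, ignore }

    // Hypothetical handle on the running node; real Cassandra uses other classes.
    static class Node {
        boolean gossipRunning = true;
        boolean transportRunning = true;

        void stopGossip() { gossipRunning = false; }
        void stopTransport() { transportRunning = false; }
    }

    // Hypothetical reaction to a CorruptSSTableException once the node is up.
    static void onCorruptSSTable(DiskFailurePolicy policy, Node node) {
        if (policy == DiskFailurePolicy.stop_paranoid) {
            // Stop serving peers and clients, but do not exit the JVM,
            // so the node can still be inspected (for example over JMX).
            node.stopGossip();
            node.stopTransport();
        }
        // Handling of the other policies is intentionally omitted from this sketch.
    }

    public static void main(String[] args) {
        Node node = new Node();
        onCorruptSSTable(DiskFailurePolicy.stop_paranoid, node);
        System.out.println(node.gossipRunning);    // false
        System.out.println(node.transportRunning); // false
    }
}
```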

      Attachments

        1. log.txt
          34 kB
          Vincent White

            People

              stefan.miklosovic Stefan Miklosovic
              VincentWhite Vincent White
              Stefan Miklosovic
              Brandon Williams, David Capwell
              Votes: 0
              Watchers: 6

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 4h