Cassandra / CASSANDRA-6756

Provide option to avoid loading orphan SSTables on startup

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Normal
    • Resolution: Won't Fix
    • Fix Version/s: 1.2.17
    • Component/s: None
    • Labels:
      None

      Description

      When Cassandra starts up, it enumerates all SSTables on disk for a known column family and loads all of them, including any that were left behind before the restart because of some earlier problem. This can lead to "data gain" (resurrected data), which is just as bad as data loss.

      The ask is to provide a yaml config option that would allow one to turn that behavior off by default, so that a Cassandra cluster would be immune to data gain when nodes get restarted (at least with leveled compaction, where Cassandra keeps track of SSTables).
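
A minimal sketch of the requested behavior, not Cassandra's actual startup code. It assumes the node has some record of the SSTables it knows about (for example, a leveled compaction manifest) and skips any on-disk data file that is missing from that record. The flag name `LOAD_ORPHAN_SSTABLES`, the function names, and the example file names are all hypothetical; the ticket does not name an option, and it was resolved as Won't Fix.

```python
from pathlib import Path

# Hypothetical flag standing in for the requested yaml option.
# The ticket does not name one, so this identifier is an assumption.
LOAD_ORPHAN_SSTABLES = False


def enumerate_data_files(directory):
    """List SSTable data components in a directory (names ending in -Data.db)."""
    return sorted(p.name for p in Path(directory).glob("*-Data.db"))


def select_sstables_to_load(on_disk, tracked, load_orphans=LOAD_ORPHAN_SSTABLES):
    """Split on-disk SSTables into (to_load, orphans).

    on_disk: names of SSTable data files found on disk.
    tracked: set of names the node already knows about; treating a
             manifest as this source of truth is an assumption.
    """
    on_disk = sorted(on_disk)
    # An orphan is any file on disk that the tracking record does not list.
    orphans = [name for name in on_disk if name not in tracked]
    if load_orphans:
        # Behavior described in the ticket: everything on disk is loaded.
        return on_disk, orphans
    # Requested behavior: load only what the node is known to own.
    to_load = [name for name in on_disk if name in tracked]
    return to_load, orphans


if __name__ == "__main__":
    # Illustrative file names only.
    found = ["ks-cf-ic-1-Data.db", "ks-cf-ic-2-Data.db", "ks-cf-ic-3-Data.db"]
    known = {"ks-cf-ic-1-Data.db", "ks-cf-ic-2-Data.db"}
    to_load, orphans = select_sstables_to_load(found, known)
    print("loading:", to_load)
    print("skipping orphans:", orphans)
```

Returning the orphans alongside the load list lets a caller log or quarantine them rather than silently ignore them, which seems closer to what an operator worried about zombie data would want.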

      This is sort of a follow-up to CASSANDRA-6503 (fixed in 1.2.14). We're just extremely nervous that orphan SSTables could appear because of some other potential problem somewhere else and cause zombie data on a random reboot.

              People

              • Assignee: Unassigned
              • Reporter: Vincent Mallet (vmallet)
              • Votes: 1
              • Watchers: 8
