Cassandra / CASSANDRA-6756

Provide option to avoid loading orphan SSTables on startup

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Normal
    • Resolution: Won't Fix
    • 1.2.17
    • None
    • None

    Description

      When Cassandra starts up, it enumerates all SSTables on disk for a known column family and loads all of them, including any that were left behind before the restart because of some earlier problem. This can lead to "data gain" (resurrected data), which is just as bad as data loss.

      The ask is to provide a yaml config option that would allow one to turn that behavior off by default, so that a Cassandra cluster would be immune to data gain when nodes get restarted (at least with Leveled compaction, where Cassandra keeps track of SSTables).

      This is sort of a follow-up to CASSANDRA-6503 (fixed in 1.2.14). We're just extremely nervous that orphan SSTables could appear because of some other potential problem somewhere else and cause zombie data on a random reboot.
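      To make the request concrete, here is a minimal sketch of the startup filtering being asked for. It is not Cassandra code: the function name, the `load_orphans` flag, and the use of plain generation numbers are all hypothetical, and it assumes the set of tracked SSTables is already available (as it would be with Leveled compaction). No such yaml option exists, since this issue was resolved as Won't Fix.

```python
def select_sstables_to_load(on_disk, tracked, load_orphans=True):
    """Decide which SSTables to load at startup.

    on_disk      -- SSTable generations found by enumerating the data directory
    tracked      -- generations Cassandra knows about (hypothetical input)
    load_orphans -- hypothetical switch; True mirrors the behavior described
                    above, where every SSTable found on disk is loaded

    Returns (to_load, orphans), both as sorted lists.
    """
    on_disk = set(on_disk)
    # An orphan is an SSTable that is on disk but not tracked.
    orphans = on_disk - set(tracked)
    if load_orphans:
        return sorted(on_disk), sorted(orphans)
    # With the switch off, orphans are reported but never loaded.
    return sorted(on_disk - orphans), sorted(orphans)


if __name__ == "__main__":
    # Generation 7 was left behind by some earlier problem.
    to_load, orphans = select_sstables_to_load(
        on_disk=[1, 2, 3, 7], tracked=[1, 2, 3], load_orphans=False
    )
    print("loading:", to_load)
    print("skipping orphans:", orphans)
```

      Returning the orphans separately rather than silently dropping them would let an operator inspect or remove the leftover files by hand.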

            People

              Assignee: Unassigned
              Reporter: Vincent Mallet (vmallet)
              Votes: 1
              Watchers: 8
