Apache Cassandra / CASSANDRA-6245

"nodetool refresh" design is unsafe


Details

    • Bug
    • Status: Resolved
    • Low
    • Resolution: Duplicate
    • Low

    Description

      CASSANDRA-2991 added a "nodetool refresh" feature by which Cassandra is able to discover non-live SSTables in the datadir and make them live.

      It does this by:

      1) looking for SSTable files in the data dir
      2) renaming SSTables it finds into the current SSTable id sequence

      This implementation has a race condition that can cause silent data loss:

      1) The node's SSTable id sequence is at #2, so the next SSTable to flush will get "2" as its numeric part
      2) Copy an SSTable with "2" as its numeric part into the data dir
      3) Run nodetool flush
      4) Notice that your "2" SSTable has been silently overwritten by the just-flushed "2" SSTable
      5) nodetool refresh still succeeds, but is now a no-op
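
The steps above can be reproduced with a toy model. This is only a sketch under stated assumptions: `ToyNode`, `simulate_race`, and the simplified `demo-<generation>-Data.db` file name are hypothetical and are not Cassandra code. It shows only that a writer which never checks for an existing file will replace a file copied in under the same number.

```python
import tempfile
from pathlib import Path


class ToyNode:
    """Hypothetical stand-in for a node that names flushed files from a counter."""

    def __init__(self, data_dir: Path, next_generation: int) -> None:
        self.data_dir = data_dir
        self.next_generation = next_generation

    def flush(self, payload: bytes) -> Path:
        # Assumption taken from the ticket: the writer does not check
        # whether a file with this generation already exists.
        path = self.data_dir / f"demo-{self.next_generation}-Data.db"
        path.write_bytes(payload)  # silently replaces any existing file
        self.next_generation += 1
        return path


def simulate_race() -> bytes:
    """Run steps 1 to 4 and return what the copied file contains afterwards."""
    with tempfile.TemporaryDirectory() as tmp:
        data_dir = Path(tmp)
        node = ToyNode(data_dir, next_generation=2)   # step 1: sequence is at 2
        copied = data_dir / "demo-2-Data.db"
        copied.write_bytes(b"restored rows")          # step 2: operator copies file "2"
        node.flush(b"memtable rows")                  # step 3: flush also writes file "2"
        return copied.read_bytes()                    # step 4: inspect the survivor
```

Calling `simulate_race()` returns the flushed payload rather than the copied one, which is the silent overwrite described in step 4.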

      A simple solution would be to create a subdirectory of the datadir called "refresh/" to serve as the location to refresh from.
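
A minimal sketch of the staging-directory idea, assuming the same hypothetical `demo-<generation>-Data.db` naming and a hypothetical helper `refresh_from_staging`. Real SSTables consist of several component files; this sketch moves only one file per table to keep the idea visible. Because staged files never sit in the data dir under a live-looking name, a concurrent flush has nothing to overwrite.

```python
import shutil
from pathlib import Path


def refresh_from_staging(data_dir: Path, next_generation: int) -> int:
    """Move each file staged in data_dir/refresh into data_dir under a fresh
    generation number. Returns the next unused generation."""
    staging = data_dir / "refresh"
    for staged in sorted(staging.glob("*-Data.db")):
        target = data_dir / f"demo-{next_generation}-Data.db"
        if target.exists():
            # Refuse to clobber a live file instead of overwriting it.
            raise FileExistsError(target)
        shutil.move(str(staged), str(target))
        next_generation += 1
    return next_generation
```

The generation is assigned at refresh time by the process that owns the counter, so the number embedded in the staged file's original name no longer matters.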

      Alternatively or additionally, there is probably no compelling reason for Cassandra to completely ignore existing files at write time. Checking for existing files at a given index and inflating the index to avoid overwriting them seems trivial and inexpensive. I will gladly file a JIRA for this change in isolation if there is interest.
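
The existing-file check could look roughly like the following. This is a sketch only: `next_free_generation` and the `demo-<generation>-*` naming are hypothetical, and the check followed by a later write is not atomic, so a real implementation would also need exclusive file creation or a lock.

```python
from pathlib import Path


def next_free_generation(data_dir: Path, candidate: int) -> int:
    """Return the first generation >= candidate that no file in data_dir uses."""
    # The trailing "-" in the pattern keeps generation 2 from matching 20, 21, ...
    while any(data_dir.glob(f"demo-{candidate}-*")):
        candidate += 1  # inflate the index past the occupied slot
    return candidate
```

With files for generations 2 and 3 already present, asking for generation 2 yields 4, so the flush in the scenario above would land beside the copied SSTable instead of on top of it.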

            People

              Assignee: Unassigned
              Reporter: Robert Coli (rcoli)
              Votes: 1
              Watchers: 3