  1. Apache Cassandra
  2. CASSANDRA-16619

Loss of commit log data possible after sstable ingest

Details

    Description

      SSTable metadata contains commit log positions of the sstable. These positions are used to filter out mutations from the commit log on restart and only make sense for the node on which the data was flushed.

      If an SSTable is moved between nodes, its commit log positions may cover regions that the receiving node has not yet flushed. Valid data is then lost if those sections of the commit log need to be replayed.
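
      The failure mode can be illustrated with a minimal sketch. It is not Cassandra's replay code: positions are modelled as plain longs, and the names Interval and shouldReplay are hypothetical, introduced only for this example. It shows how an interval carried by an sstable from another node can cause a locally unflushed mutation to be skipped.

```java
import java.util.List;

public class ForeignIntervalProblem {
    // Hypothetical, simplified commit log interval (start exclusive, end inclusive).
    // Real commit log positions are not plain longs; this is an assumption of the sketch.
    static final class Interval {
        final long start;
        final long end;

        Interval(long start, long end) {
            this.start = start;
            this.end = end;
        }

        boolean contains(long position) {
            return position > start && position <= end;
        }
    }

    // A mutation is replayed only if no sstable interval claims to cover its position.
    static boolean shouldReplay(long position, List<Interval> flushedIntervals) {
        return flushedIntervals.stream().noneMatch(i -> i.contains(position));
    }

    public static void main(String[] args) {
        long unflushedMutation = 150; // written locally, present only in the commit log

        // Intervals taken from sstables flushed on this node.
        List<Interval> localOnly = List.of(new Interval(0, 100));
        // The same, plus an interval from an sstable that was flushed on another node.
        List<Interval> withImported = List.of(new Interval(0, 100), new Interval(120, 200));

        System.out.println(shouldReplay(unflushedMutation, localOnly));    // true
        System.out.println(shouldReplay(unflushedMutation, withImported)); // false
    }
}
```

      In the second case the mutation at position 150 is filtered out even though the receiving node never flushed it.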

      Solution:
      The chosen solution introduces a new field in the sstable metadata (StatsMetadata): originatingHostId (UUID), the local host id of the node on which the sstable was created, or null if not known. Commit log intervals from an sstable are taken into account during commit log replay only when the originatingHostId of the sstable matches the local node's hostId.

      For new sstables the originatingHostId is set according to StorageService's local hostId.
      For compacted sstables the originatingHostId is also set according to StorageService's local hostId, and only commit log intervals from local sstables are preserved in the resulting sstable.
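
      The rules above can be sketched as follows. This is a simplified model, not the actual patch: SSTableStats, Interval, intervalsForReplay and compact are hypothetical names, and only originatingHostId and the notion of commit log intervals come from the description.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

public class OriginatingHostFilter {
    // Hypothetical, simplified commit log interval.
    static final class Interval {
        final long start;
        final long end;

        Interval(long start, long end) {
            this.start = start;
            this.end = end;
        }
    }

    // Hypothetical stand-in for the relevant part of the sstable metadata.
    static final class SSTableStats {
        final UUID originatingHostId; // null when the origin is not known
        final List<Interval> commitLogIntervals;

        SSTableStats(UUID originatingHostId, List<Interval> commitLogIntervals) {
            this.originatingHostId = originatingHostId;
            this.commitLogIntervals = commitLogIntervals;
        }
    }

    // Replay considers intervals only from sstables created on this node.
    // An unknown (null) origin never matches, so its intervals are ignored.
    static List<Interval> intervalsForReplay(List<SSTableStats> sstables, UUID localHostId) {
        List<Interval> result = new ArrayList<>();
        for (SSTableStats s : sstables) {
            if (localHostId.equals(s.originatingHostId)) {
                result.addAll(s.commitLogIntervals);
            }
        }
        return result;
    }

    // A compacted sstable is stamped with the local host id and keeps
    // only the intervals that came from local input sstables.
    static SSTableStats compact(List<SSTableStats> inputs, UUID localHostId) {
        return new SSTableStats(localHostId, intervalsForReplay(inputs, localHostId));
    }

    public static void main(String[] args) {
        UUID local = UUID.randomUUID();
        UUID remote = UUID.randomUUID();

        List<SSTableStats> sstables = List.of(
            new SSTableStats(local, List.of(new Interval(0, 100))),
            new SSTableStats(remote, List.of(new Interval(120, 200))),
            new SSTableStats(null, List.of(new Interval(300, 400))));

        System.out.println(intervalsForReplay(sstables, local).size());          // 1
        System.out.println(compact(sstables, local).commitLogIntervals.size());  // 1
    }
}
```

      Treating a null originatingHostId as non-local is the conservative choice in this sketch: at worst some mutations are replayed again, rather than skipped.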

      discovered by jakubzytka

            People

              jlewandowski Jacek Lewandowski
              jlewandowski Jacek Lewandowski
              Jacek Lewandowski
              Benjamin Lerer, Branimir Lambov
              Votes: 0
              Watchers: 13

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 2h