  1. Kudu
  2. KUDU-152

Log GC race between Append() and Apply()



    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • M3
    • None
    • log
    • None


      Currently, the OpId-anchoring Log GC code has a race: through a particular series of events, we may delete WAL segments before the operations they contain have been applied to the in-memory data structures, where they would normally be anchored until they are flushed to disk.

      This is the series of events that may occur:

      1. Start with a quiesced tablet (no outstanding unflushed data, no log anchors)
      2. Log OpId 1 (add OpId 1 to Prepare queue)
      3. Log OpId 2, which causes OpId 1's log segment to be rolled because the WAL file is full
      4. The Log::GC() thread runs, sees no anchor on the rolled segment, and deletes the segment containing OpId 1
      5. Prepare for OpId 1 finally completes
      6. Apply for OpId 1 finally completes. At this point OpId 1 is anchored in the mem store, but its log segment has already been deleted.
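
      The sequence above can be replayed deterministically with a toy model. This is only an illustrative sketch: ToyLog, its members, and ReplayRace() are hypothetical names invented here, not Kudu's Log class, and the model assumes anchors are registered only when Apply() completes.

```cpp
#include <cstdint>
#include <map>
#include <set>

// Toy model only: ToyLog and its members are hypothetical, not Kudu's Log class.
struct ToyLog {
  std::map<int64_t, int> segment_of;  // OpId -> segment number that holds it
  std::set<int> live_segments{0};     // segments still present on disk
  std::set<int64_t> anchors;          // OpIds pinned by in-memory stores
  int active_segment = 0;

  // Appends an op. `roll` simulates the active WAL file being full, so the
  // current segment is closed and the op lands in a fresh segment.
  void Append(int64_t op_id, bool roll) {
    if (roll) {
      live_segments.insert(++active_segment);
    }
    segment_of[op_id] = active_segment;
  }

  // Deletes every closed segment that holds no anchored OpId.
  void GC() {
    std::set<int> pinned;
    for (int64_t op : anchors) {
      pinned.insert(segment_of.at(op));
    }
    for (auto it = live_segments.begin(); it != live_segments.end();) {
      const bool closed = (*it != active_segment);
      if (closed && pinned.count(*it) == 0) {
        it = live_segments.erase(it);
      } else {
        ++it;
      }
    }
  }

  // In this model the anchor is only taken once Apply() completes.
  void Apply(int64_t op_id) { anchors.insert(op_id); }

  // True if the segment holding `op_id` is still on disk.
  bool HasDataFor(int64_t op_id) const {
    return live_segments.count(segment_of.at(op_id)) > 0;
  }
};

// Replays steps 1-6 and reports whether OpId 1's segment survived.
bool ReplayRace() {
  ToyLog log;                // 1. quiesced tablet, no anchors
  log.Append(1, false);      // 2. log OpId 1
  log.Append(2, true);       // 3. log OpId 2, rolling OpId 1's segment
  log.GC();                  // 4. GC sees no anchor on the rolled segment
  log.Apply(1);              // 5-6. Prepare and Apply complete, too late
  return log.HasDataFor(1);
}
```

      In this model ReplayRace() returns false: OpId 1 ends up anchored, but the segment that held it is already gone.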

      Due to this race, we need a way to anchor on the OpId as soon as the OpId is assigned inside of Append(), and that anchor must be maintained until Apply() completes.
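
      One way such an anchor could look is sketched below, assuming a reference-counted registry keyed by OpId that Log GC consults before deleting a segment. ToyAnchorRegistry, Register(), Unregister(), EarliestAnchored(), and SegmentIsCollectable() are hypothetical names for illustration and are not taken from the Kudu code base.

```cpp
#include <cstdint>
#include <limits>
#include <map>
#include <mutex>

// Hypothetical registry; names are illustrative, not Kudu's API.
class ToyAnchorRegistry {
 public:
  // Would be called from Append() as soon as the OpId is assigned.
  void Register(int64_t op_id) {
    std::lock_guard<std::mutex> l(lock_);
    ++anchors_[op_id];
  }

  // Would be called once Apply() has completed for this OpId.
  void Unregister(int64_t op_id) {
    std::lock_guard<std::mutex> l(lock_);
    auto it = anchors_.find(op_id);
    if (it != anchors_.end() && --it->second == 0) {
      anchors_.erase(it);
    }
  }

  // Earliest OpId that GC must retain; max() means nothing is anchored.
  int64_t EarliestAnchored() const {
    std::lock_guard<std::mutex> l(lock_);
    return anchors_.empty() ? std::numeric_limits<int64_t>::max()
                            : anchors_.begin()->first;
  }

 private:
  mutable std::mutex lock_;
  std::map<int64_t, int> anchors_;  // OpId -> number of outstanding anchors
};

// A closed segment is collectable only if every OpId it contains precedes
// the earliest anchored OpId.
bool SegmentIsCollectable(const ToyAnchorRegistry& reg,
                          int64_t max_op_id_in_segment) {
  return max_op_id_in_segment < reg.EarliestAnchored();
}
```

      With this shape, the rolled segment in step 4 would not be collectable, because OpId 1 is registered during Append() and stays registered until Apply() finishes. The sketch assumes the mem store takes its own anchor before the Append()-time anchor is released.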

      Other ideas involve using MVCC snapshots to detect such problems, but that only narrows the race window: in-flight timestamps are not tracked until the Prepare() phase is complete, which is too late in the scenario above. This could be improved further by making mem stores inherit the anchor from their predecessor until some criterion is met (either time-based or triggered by some other event). However, at this time we don't have consensus on a criterion that would guarantee 100% safety from data loss.





              mpercy Mike Percy