  1. Ignite
  2. IGNITE-5772

Race between WAL segment rollover and concurrent log

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.1
    • Fix Version/s: 2.3
    • Component/s: cache
    • Labels: None

    Description

      The WAL log() and close() operations are synchronized as follows:
      log: read head, check the stop flag, CAS head.
      close: set the stop flag, CAS head to a fake record.
      This guarantees that once close() has completed, no further records are appended to the closed segment.
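A minimal sketch of the protocol described above, written as a simplified model rather than Ignite's actual code. The class and member names (SegmentSketch, Rec, FakeRec, head, stop) are hypothetical and chosen only for illustration.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical, simplified model of one WAL segment; not Ignite's real classes. */
class SegmentSketch {
    /** A record in the segment's linked chain. */
    static class Rec {
        final Rec prev;
        final String payload;

        Rec(Rec prev, String payload) {
            this.prev = prev;
            this.payload = payload;
        }
    }

    /** Marker record that carries no payload. */
    static final class FakeRec extends Rec {
        FakeRec(Rec prev) {
            super(prev, null);
        }
    }

    final AtomicReference<Rec> head = new AtomicReference<>(new FakeRec(null));
    volatile boolean stop;

    /** log: read head, check the stop flag, CAS head. Returns false once the segment is closed. */
    boolean log(String payload) {
        for (;;) {
            Rec h = head.get();                               // read head
            if (stop)                                         // check the stop flag
                return false;
            if (head.compareAndSet(h, new Rec(h, payload)))   // CAS head
                return true;
            // CAS failed: head moved under us, so retry and re-check the flag.
        }
    }

    /** close: set the stop flag, then unconditionally CAS head to a fresh fake record. */
    void close() {
        stop = true;
        for (;;) {
            Rec h = head.get();
            if (head.compareAndSet(h, new FakeRec(h)))
                return;
        }
    }
}
```

In this model, a logger that read head before close() finished fails its CAS, because close() always installs a new object as head, and on retry it observes the stop flag.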
      Now consider three threads doing the following operations:
      T1: flush(); T2: rollOver(); T3: log();
      The sequence of events:
      1) T1 CASes head to a FakeRecord.
      2) T3 reads head as a FakeRecord and reads the stop flag as false.
      3) T2 attempts rollOver: CAS stop to true; call flushOrWait(null); call flush(null). Since head is already an instance of FakeRecord, flush(null) immediately returns false. T2 then waits for the written bytes and proceeds.
      4) T3 successfully CASes head to a non-fake record.
      5) T2 proceeds with rollOver, signals next available and asserts on head.
      The invariant above is broken because T2 does not CAS a fake record during rollover when head is already a FakeRecord, which allows T3 to append an entry to the closed segment. The solution is to change the code so that the CAS is always attempted on close, even if the current head is already a FakeRecord.
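The interleaving can be replayed deterministically on a single thread by splitting T3's log() into its read and CAS halves. The sketch below is a simplified model, not Ignite's code: RolloverRaceSketch, closeBuggy, closeFixed and replay are hypothetical names, and the stop flag is set with a plain volatile write rather than a CAS.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical single-threaded replay of steps 1-5; not Ignite's real classes. */
class RolloverRaceSketch {
    static class Rec {}
    static final class FakeRec extends Rec {}

    final AtomicReference<Rec> head = new AtomicReference<>(new Rec());
    volatile boolean stop;

    /** Models flush(): CAS head to a fake record, returning false at once if head is already fake. */
    boolean flush() {
        Rec h = head.get();
        if (h instanceof FakeRec)
            return false;
        return head.compareAndSet(h, new FakeRec());
    }

    /** Close as described in the report: the CAS is skipped when head is already fake. */
    void closeBuggy() {
        stop = true;
        flush();
    }

    /** Proposed fix: always CAS a fresh fake record, whatever the current head is. */
    void closeFixed() {
        stop = true;
        Rec h;
        do {
            h = head.get();
        } while (!head.compareAndSet(h, new FakeRec()));
    }

    /** Replays the reported interleaving; returns true if T3 appended to the closed segment. */
    static boolean replay(boolean fixed) {
        RolloverRaceSketch seg = new RolloverRaceSketch();

        seg.flush();                      // 1) T1: head becomes a fake record

        Rec seen = seg.head.get();        // 2) T3: reads head (fake) ...
        boolean stopSeen = seg.stop;      //        ... and stop == false

        if (fixed)                        // 3) T2: close the segment during rollover
            seg.closeFixed();
        else
            seg.closeBuggy();

        // 4) T3: resumes with its stale view and attempts the CAS.
        return !stopSeen && seg.head.compareAndSet(seen, new Rec());
    }

    public static void main(String[] args) {
        System.out.println("buggy close, T3 appended: " + replay(false));
        System.out.println("fixed close, T3 appended: " + replay(true));
    }
}
```

With the unconditional CAS, head is a different object from the one T3 observed in step 2, so T3's CAS fails and it would re-read the stop flag on retry.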
      Alternatively, we can introduce another type of fake record that seals the WAL segment queue.
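One possible reading of the sealing-record alternative is sketched below. This is an assumption about how such a record might behave, not the change that was actually made: SealedSegmentSketch and SealRec are hypothetical names, and in this model the sealing record replaces the stop flag entirely.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical model of a segment sealed by a dedicated terminal record. */
class SealedSegmentSketch {
    static class Rec {}

    /** Flush marker: logging may continue after it. */
    static final class FakeRec extends Rec {}

    /** Hypothetical terminal marker: once it is head, nothing may replace it. */
    static final class SealRec extends Rec {}

    final AtomicReference<Rec> head = new AtomicReference<>(new Rec());

    /** Installs a flush marker unless head is already a marker of either kind. */
    boolean flush() {
        Rec h = head.get();
        if (h instanceof FakeRec || h instanceof SealRec)
            return false;
        return head.compareAndSet(h, new FakeRec());
    }

    /** Appends a record; returns false if the segment has been sealed. */
    boolean log() {
        for (;;) {
            Rec h = head.get();
            if (h instanceof SealRec)
                return false;
            if (head.compareAndSet(h, new Rec()))
                return true;
        }
    }

    /** Seals the segment, even when head is already a flush marker. */
    void close() {
        for (;;) {
            Rec h = head.get();
            if (h instanceof SealRec || head.compareAndSet(h, new SealRec()))
                return;
        }
    }
}
```

Because a sealing record is distinct from an ordinary flush marker, close() cannot mistake an earlier flush for a completed seal, which is the confusion behind the reported race.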

            People

              Assignee: agoncharuk Alexey Goncharuk
              Reporter: agoncharuk Alexey Goncharuk