  1. Kudu
  2. KUDU-2102

PosixRWFile::Sync doesn't guarantee durability when used with multiple threads



    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.5.0
    • Fix Version/s: 1.5.0
    • Component/s: util
    • Labels: None


      PosixRWFile uses an AtomicBool to "optimize" calls to Sync(). It works as follows:

      1. The bool starts as false.
      2. It is set to true in WriteV().
      3. In Sync(), we CAS (compare-and-swap) it from true to false. Only if the CAS succeeds do we actually call fsync().

      The idea is that if two threads call Sync() at the same time, only one will actually do the fsync(). However, there's a problem here: the "losing" thread returns from Sync() early and proceeds on the assumption that the file's data is durable, even though the "winning" thread's fsync() may still be in progress.

      We have two options:

      1. Preserve the optimization but fix it so that the losing thread(s) wait for the "winning" thread to finish the fsync. This can be done with some more synchronization primitives (off the top of my head: a lock, a condition variable, and another boolean).
      2. Remove the optimization and let the losing thread(s) perform additional fsyncs.
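
      As a rough illustration of option 1, here is one way the lock, condition variable, and extra boolean could fit together. It is a sketch under assumptions, not the fix that was applied: the class name WaitingSyncFile, its members, and the fsyncs_issued() counter are hypothetical, and the dirty flag becomes a plain bool guarded by the lock instead of an atomic.

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <unistd.h>

// Hypothetical sketch of option 1; names are illustrative only.
class WaitingSyncFile {
 public:
  explicit WaitingSyncFile(int fd) : fd_(fd) {}

  bool Write(const void* buf, size_t len) {
    ssize_t n = ::write(fd_, buf, len);
    // Mark dirty only after the write has returned. Marking it first would
    // let a concurrent Sync() clear the flag before this data was written.
    std::lock_guard<std::mutex> l(mu_);
    dirty_ = true;
    return n == static_cast<ssize_t>(len);
  }

  // Returns true once everything written before the call has been fsync()ed.
  bool Sync() {
    std::unique_lock<std::mutex> l(mu_);
    // Losing threads block here until the winning thread's fsync() is done.
    cv_.wait(l, [this] { return !sync_in_progress_; });
    if (!dirty_) return true;  // Nothing written since the last completed sync.
    dirty_ = false;
    sync_in_progress_ = true;
    l.unlock();
    int rc = ::fsync(fd_);  // Performed without holding the lock.
    l.lock();
    ++fsyncs_issued_;
    sync_in_progress_ = false;
    if (rc != 0) dirty_ = true;  // Let a waiter or a later caller retry.
    l.unlock();
    cv_.notify_all();
    return rc == 0;
  }

  // Number of fsync() calls issued so far (for illustration and testing).
  int fsyncs_issued() {
    std::lock_guard<std::mutex> l(mu_);
    return fsyncs_issued_;
  }

 private:
  int fd_;
  std::mutex mu_;
  std::condition_variable cv_;
  bool dirty_ = false;             // Plays the role of the original flag.
  bool sync_in_progress_ = false;  // The "other boolean".
  int fsyncs_issued_ = 0;
};
```

      A thread that wrote before the winner's fsync() began waits and then returns without a second fsync(). A thread that wrote while it was running finds the flag set again and performs its own.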

      To measure the effect of the optimization, I wrote a test program that opens a file and fsyncs it 1000 times. I ran it on an el6.6 box, on spinning disks mounted as xfs and ext4, and on files that were either empty or 10G in size (dropping caches first). I measured the cost of an fsync to be around 200 microseconds, suggesting that an fsync() of a file with no dirty data performs no I/O and that the overhead is purely syscall-related.




            hahao Hao Hao
            adar Adar Dembo