  1. Apache Ozone
  2. HDDS-6517 Snapshot support for Ozone
  3. HDDS-8390

Synchronization between Snapshot Deletes/GC and other Snapshot jobs (read/diff)


Details

    Description

      We need proper synchronization between Snapshot delete/GC and other Snapshot jobs, e.g. reads from Snapshots and SnapDiff. SnapDiff is a particularly important case because it can be a long-running job, and Snapshot delete/GC can kick in while the job is still in progress.

      Cluster behavior should also be uniform across a failover when SnapDiff and deletes run concurrently. A leader OM node must not return one result to a client while the new OM leader returns a different result after a failover.

      Thus, to prevent the client from receiving a partial SnapDiff result without realizing it, and to avoid explicitly holding a lock, we want to use an approach similar to optimistic locking: check whether the snapshot is still ACTIVE towards the end of the request lifetime, when the SnapDiff service has already collected all the batch entries in a buffer. See the attachment for a timeline of the potential race condition: 35fdc3bd-cd0c-40f3-8fd7-2d8a8dc4643d.pdf
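
The optimistic check described above could look roughly like the following sketch. It is not Ozone code: `SnapDiffSketch`, `Status`, `requireActive`, `snapDiff`, and the `duringJob` hook (which only simulates a delete/GC arriving mid-job) are hypothetical names chosen for illustration. The idea shown is that no lock is held while entries are collected, and the buffered result is handed back only if both snapshots are still ACTIVE when collection has finished.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; none of these names come from the Ozone code base.
class SnapDiffSketch {
    enum Status { ACTIVE, DELETED }

    // Stand-in for the snapshot metadata that delete/GC would update.
    private final Map<String, Status> snapshots = new ConcurrentHashMap<>();

    void create(String name) { snapshots.put(name, Status.ACTIVE); }

    void markDeleted(String name) { snapshots.put(name, Status.DELETED); }

    private void requireActive(String name) {
        if (snapshots.get(name) != Status.ACTIVE) {
            throw new IllegalStateException("Snapshot not ACTIVE: " + name);
        }
    }

    // Collects diff entries into a buffer without holding a lock, then
    // re-validates snapshot status before returning the buffer.
    List<String> snapDiff(String from, String to,
                          List<String> entries, Runnable duringJob) {
        // Early check: fail fast if a snapshot is already gone.
        requireActive(from);
        requireActive(to);

        List<String> buffer = new ArrayList<>(entries.size());
        for (String entry : entries) {
            buffer.add(entry);
        }

        // Simulation hook: a concurrent delete/GC may run at this point.
        duringJob.run();

        // Optimistic validation at the end of the request lifetime: if either
        // snapshot was deleted meanwhile, reject instead of returning a
        // possibly partial result.
        requireActive(from);
        requireActive(to);
        return buffer;
    }
}
```

With this shape, a diff that overlaps a delete fails loudly with an error rather than silently returning a partial result, and the client can retry or report it. The final check and the return are not atomic in this sketch, so a real implementation would still need to decide how a delete that lands after the last check is handled.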

            People

              hemantk Hemant Kumar
              ppogde Prashant Pogde