Geode / GEODE-3161 Improvements to backups / GEODE-2654

Backups can capture different members from different points in time


Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.3.0
    • Component/s: persistence

    Description

      Recovering from a Geode backup should behave the same as recovering from disk after killing all of the members.

      Unfortunately, backups can instead back up data on different members at different points in time, resulting in application-level inconsistency. Here's an example of what goes wrong (a minimal sketch of the sequence follows the steps):

      1. Start a Backup
      2. Do a put in region A
      3. Do a put in region B
      4. The backup finishes
      5. Recover from the backup
      6. You may see the put to region B but not the put to region A, even if the data is colocated.
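
      A minimal sketch of this sequence from client code, assuming two colocated partitioned regions named "A" and "B" already exist on the servers and that the backup itself is started separately (for example with gfsh's "backup disk-store" command). The locator address and keys are made up for illustration.

        import org.apache.geode.cache.Region;
        import org.apache.geode.cache.client.ClientCache;
        import org.apache.geode.cache.client.ClientCacheFactory;
        import org.apache.geode.cache.client.ClientRegionShortcut;

        public class BackupConsistencyRepro {
          public static void main(String[] args) {
            ClientCache cache = new ClientCacheFactory()
                .addPoolLocator("localhost", 10334) // assumed locator host/port
                .create();
            Region<String, String> regionA = cache
                .<String, String>createClientRegionFactory(ClientRegionShortcut.PROXY)
                .create("A");
            Region<String, String> regionB = cache
                .<String, String>createClientRegionFactory(ClientRegionShortcut.PROXY)
                .create("B");

            // Step 1 happens elsewhere: an operator runs "backup disk-store --dir=..."

            // Steps 2-3: puts issued while members are still copying their files
            regionA.put("key", "valueA");
            regionB.put("key", "valueB");

            // Steps 4-6: once the backup finishes and the cluster is later
            // restored from it, the put to "B" may be present while the put to
            // "A" is missing, because each member copied its oplogs at a
            // different point in time.
            cache.close();
          }
        }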

      We ran into this with Lucene indexes; see GEODE-2643. We've worked around GEODE-2643 by putting all of the data into the same region, but we're worried that we still have a problem with the async event queue: with an async event listener that writes to another Geode region, it's possible to recover different points in time for the async event queue and the region, resulting in missed events.

      The issue is that there is no locking or other mechanism to prevent different members from backing up their data at different points in time. Colocating data does not avoid this problem, because when we recover from disk we may recover region A's bucket from one member and region B's bucket from another member.

      The backup operation does have a mechanism for making sure that it gets a point-in-time snapshot of metadata. It sends a PrepareBackupRequest to all members, which causes them to lock their init file. Then it sends a FinishBackupRequest, which tells all members to back up their data and release the lock. This ensures that a backup doesn't completely miss a bucket or get corrupt metadata about which members host a bucket. See the comments in DiskStoreImpl.lockStoreBeforeBackup.
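
      A simplified, self-contained sketch of that two-phase shape; the types below are illustrative stand-ins, not Geode's internal PrepareBackupRequest/FinishBackupRequest classes.

        import java.util.List;

        interface BackupMember {
          void prepareBackup(); // lock the init file (DiskStoreImpl.lockStoreBeforeBackup)
          void finishBackup();  // copy files to the backup dir, then release the lock
        }

        class BackupCoordinator {
          // Phase 1 completes on every member before phase 2 starts anywhere,
          // so the metadata snapshot is consistent across the cluster.
          void backup(List<BackupMember> members) {
            members.forEach(BackupMember::prepareBackup); // PrepareBackupRequest
            members.forEach(BackupMember::finishBackup);  // FinishBackupRequest
          }
        }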

      We should extend this Prepare/Finish mechanism to make sure we get a point-in-time snapshot of region data as well. One way to do this would be to get a lock on the oplog in lockStoreBeforeBackup to prevent writes and hold it until releaseBackupLock is called.
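
      A hedged sketch of that idea (not the actual DiskStoreImpl code): hold a write lock on the oplog from lockStoreBeforeBackup until releaseBackupLock, while every oplog write takes the read side. StampedLock is used here only because its stamps are not tied to the acquiring thread, which matters if the prepare and finish messages are processed on different threads.

        import java.util.concurrent.locks.StampedLock;

        class OplogBackupLock {
          private final StampedLock oplogLock = new StampedLock();
          private volatile long backupStamp;

          // Called while handling PrepareBackupRequest: block new oplog writes.
          void lockStoreBeforeBackup() {
            backupStamp = oplogLock.writeLock();
          }

          // Every oplog write takes the read side, so in-flight writes drain
          // before the backup proceeds and no new writes start during it.
          void writeToOplog(Runnable write) {
            long stamp = oplogLock.readLock();
            try {
              write.run();
            } finally {
              oplogLock.unlockRead(stamp);
            }
          }

          // Called from releaseBackupLock once this member's files are copied.
          void releaseBackupLock() {
            oplogLock.unlockWrite(backupStamp);
          }
        }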

          People

            Assignee: Lynn Gallinat (lgallinat)
            Reporter: Dan Smith (upthewaterspout)
