HBase / HBASE-14142

HBase Backup/Restore Phase 3: Edits deduplication during backup

Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Invalid
    • None
    • HBASE-7912
    • None
    • None

    Description

      Because we do not record the last backed-up sequence IDs (MVCC) and do not restore up to those sequence IDs (which is tricky to do), store files will contain some duplicate KVs after the first incremental restore that follows a full backup. These duplicates result from how we perform a full backup and the first incremental backup after it. During a full backup we perform a distributed log roll and record, for every RS, the last WAL timestamp; then we take a snapshot. The next WAL after the recorded one goes into the next incremental backup set, but it will contain some edits (puts, deletes) that were already captured by the preceding snapshot. During restore we first restore the snapshot and then replay the WALs, and this replay can create duplicate KVs in different store files.
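
One way to picture the deduplication is the minimal Java sketch below. It assumes, contrary to the current behavior described above, that the backup records for each region the highest sequence ID already covered by the snapshot. The class `WalReplayFilter`, the type `WalEdit`, and the map `snapshotMaxSeqId` are hypothetical names introduced only for illustration; they are not HBase classes or APIs, and this is not necessarily how the backup code would implement it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class WalReplayFilter {

    /** Hypothetical, simplified stand-in for one WAL entry (not an HBase class). */
    public static final class WalEdit {
        public final String region;
        public final long sequenceId;
        public final String description;

        public WalEdit(String region, long sequenceId, String description) {
            this.region = region;
            this.sequenceId = sequenceId;
            this.description = description;
        }
    }

    /**
     * Returns only the edits that the snapshot does not already contain.
     * snapshotMaxSeqId is an assumed per-region record of the highest
     * sequence ID captured by the snapshot.
     */
    public static List<WalEdit> dropAlreadySnapshotted(
            List<WalEdit> walEdits, Map<String, Long> snapshotMaxSeqId) {
        List<WalEdit> toReplay = new ArrayList<>();
        for (WalEdit edit : walEdits) {
            // Regions with no recorded sequence ID keep all of their edits.
            long covered = snapshotMaxSeqId.getOrDefault(edit.region, -1L);
            if (edit.sequenceId > covered) {
                toReplay.add(edit);
            }
        }
        return toReplay;
    }

    public static void main(String[] args) {
        List<WalEdit> edits = Arrays.asList(
                new WalEdit("r1", 5L, "put"),
                new WalEdit("r1", 6L, "delete"),
                new WalEdit("r1", 7L, "put"));
        Map<String, Long> covered = new HashMap<>();
        covered.put("r1", 6L); // the snapshot already holds edits up to 6

        for (WalEdit e : dropAlreadySnapshotted(edits, covered)) {
            System.out.println(e.region + " " + e.sequenceId + " " + e.description);
        }
    }
}
```

Filtering at replay time leaves the WAL files in the incremental backup set untouched; only the edits handed to the replay step change.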

            People

              Assignee: Unassigned
              Reporter: Vladimir Rodionov (vrodionov)
              Votes: 0
              Watchers: 3

              Dates

                Created:
                Updated:
                Resolved: