HBASE-17852: Add fault tolerance to HBASE-14417 (Support bulk loaded files in incremental backup)

Parent issue: HBASE-15227 HBase Backup Phase 3: Fault tolerance (client/server) support

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.0.0-alpha-1
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed
    • Something like two-phase commit (2PC) must be implemented when recording the list of files in the backup system table, with rollback in case of a failure and cleanup of failed files.

    Description

      The rollback-via-snapshot design approach implemented in this ticket:

      1. Before a backup create/delete/merge operation starts, we take a snapshot of the backup meta-table (the backup system table). This procedure is lightweight because the meta table is small and should usually fit in a single region.
      2. When an operation fails on the server side, we handle the failure by cleaning up partial data in the backup destination and then restoring the backup meta-table from the snapshot.
      3. When an operation fails on the client side (abnormal termination, for example), the next time the user tries create/merge/delete, they will see an error message stating that the system is in an inconsistent state and that repair is required. The user will then need to run the backup repair tool.
      4. To avoid multiple writers to the backup system table (the backup client and the BackupObservers), we introduce a small table used ONLY to keep the list of bulk loaded files. All backup observers will work only with this new table. The reason: if a failure occurs during backup create/delete/merge/restore and the system performs an automatic rollback, some data written by backup observers during the failed operation may be lost. This is what we try to avoid.
      5. The second table keeps only bulk-load-related references. We do not care about the consistency of this table, because bulk load is an idempotent operation and can be repeated after a failure. Partially written data in the second table does not affect the BackupHFileCleaner plugin, because this data (the list of bulk loaded files) corresponds to files that have not yet been loaded successfully and hence are not visible to the system.
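
      The five steps above can be sketched as a small in-memory model. This is only an illustration under stated assumptions, not the code in the attached patch and not the HBase API: the class `BackupRollbackSketch`, its fields, and its methods are all hypothetical names, a `HashMap` stands in for the backup system table, and a `List` stands in for the separate bulk-load table. Cleanup of partial data in the backup destination (part of step 2) is omitted.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory model of rollback-via-snapshot; not the HBase API. */
class BackupRollbackSketch {
    // Stand-in for the backup system (meta) table, written only by the backup client.
    final Map<String, String> metaTable = new HashMap<>();
    // Stand-in for the separate bulk-load table written by observers; never rolled back.
    final List<String> bulkLoadTable = new ArrayList<>();
    // Copy of metaTable taken at the start of an operation; non-null while one is in progress.
    private Map<String, String> snapshot;

    /** A backup create/delete/merge operation that may fail. */
    interface Operation {
        void run(BackupRollbackSketch state) throws Exception;
    }

    /** True if an earlier operation left its snapshot behind (for example, a client crash). */
    boolean repairRequired() {
        return snapshot != null;
    }

    /** Step 1: snapshot the meta table before the operation starts. */
    void begin() {
        if (repairRequired()) {
            // Step 3: refuse to start until the leftover state has been repaired.
            throw new IllegalStateException("inconsistent state: run repair first");
        }
        snapshot = new HashMap<>(metaTable);
    }

    /** Success path: the snapshot is no longer needed. */
    void commit() {
        snapshot = null;
    }

    /** Step 2 (and the repair path of step 3): restore the meta table from the snapshot. */
    void rollback() {
        if (snapshot == null) {
            return;
        }
        metaTable.clear();
        metaTable.putAll(snapshot);
        snapshot = null;
    }

    /** Runs the operation; returns true on success, false if it failed and was rolled back. */
    boolean execute(Operation op) {
        begin();
        try {
            op.run(this);
            commit();
            return true;
        } catch (Exception e) {
            rollback();
            return false;
        }
    }

    public static void main(String[] args) {
        BackupRollbackSketch s = new BackupRollbackSketch();
        s.metaTable.put("backup_1", "COMPLETE");
        boolean ok = s.execute(st -> {
            st.metaTable.put("backup_2", "RUNNING");   // lost on rollback
            st.bulkLoadTable.add("hfile-a");           // survives rollback (steps 4 and 5)
            throw new IllegalStateException("simulated server-side failure");
        });
        System.out.println(ok + " " + s.metaTable + " " + s.bulkLoadTable);
    }
}
```

      In this model, calling `begin()` without a matching `commit()` or `rollback()` plays the role of a client that terminated abnormally: the next `begin()` fails until `rollback()` is called, which corresponds to running the repair tool. Keeping `bulkLoadTable` outside the snapshot is what lets observer writes survive an automatic rollback.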

      Attachments

        1. screenshot-1.png
          152 kB
          Apekshit Sharma
        2. HBASE-17852-v10.patch
          86 kB
          Vladimir Rodionov

            People

              Assignee: Vladimir Rodionov (vrodionov)
              Reporter: Vladimir Rodionov (vrodionov)
              Votes: 0
              Watchers: 9
