  1. HBase
  2. HBASE-10900

FULL table backup and restore



    • Type: Task
    • Status: Resolved
    • Priority: Major
    • Resolution: Done


      Feature Description

      This is a subtask of HBASE-7912 to support FULL backup/restore. It provides the following functionality:

      Backup and restore example
      /* backup from sourcecluster to targetcluster */
      /* if no table name is specified, all tables from the source cluster are backed up */
      [sourcecluster]$ hbase backup create full hdfs://hostname.targetcluster.org:9000/userid/backupdir t1_dn,t2_dn,t3_dn
      /* restore on targetcluster; this is a local restore */
      /* backup_1396650096738 - backup image name */
      /* t1_dn, etc. are the original table names. All tables are restored if none are specified */
      /* t1_dn_restore, etc. are the restored tables. If not specified, the original table names are used */
      [targetcluster]$ hbase restore /userid/backupdir backup_1396650096738 t1_dn,t2_dn,t3_dn t1_dn_restore,t2_dn_restore,t3_dn_restore
      /* restore from targetcluster back to sourcecluster; this is a remote restore */
      [sourcecluster]$ hbase restore hdfs://hostname.targetcluster.org:9000/userid/backupdir backup_1396650096738 t1_dn,t2_dn,t3_dn t1_dn_restore,t2_dn_restore,t3_dn_restore
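
      The restore command pairs original and restored table names through two comma-separated lists. A small pre-flight check can catch a mismatch before the command runs. This is only a sketch: it assumes the two lists are matched by position and must have the same length, which the example suggests but this issue does not state. The helper names count_csv and check_restore_mapping are hypothetical and not part of the patch.

```shell
#!/bin/sh
# Hypothetical pre-flight check for the restore arguments; not part of the patch.

# Count the entries in a comma-separated list (an empty string counts as 0).
count_csv() {
  if [ -z "$1" ]; then
    echo 0
  else
    echo "$1" | awk -F',' '{ print NF }'
  fi
}

# Succeed if the restored-table list is empty (original names are reused)
# or has exactly as many entries as the original-table list.
# Assumption: the lists are paired by position.
check_restore_mapping() {
  originals=$1
  restored=$2
  if [ -z "$restored" ]; then
    return 0
  fi
  [ "$(count_csv "$originals")" -eq "$(count_csv "$restored")" ]
}
```

      A script could call check_restore_mapping with the two lists and invoke hbase restore only when it succeeds.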

      Detailed layout and framework for follow-up JIRAs

      The patch is a wrapper around the existing snapshot and ExportSnapshot functionality, and will serve as the base framework for the overall solution of HBASE-7912, as described below:
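
      For orientation, the two existing steps being wrapped can be run by hand roughly as follows. This is a sketch of the manual equivalent, not the patch's code path. The snapshot naming scheme (snap_<table>), the function names and the DRY_RUN switch are assumptions made for illustration; how BackupHandler names its snapshots is not stated in this issue.

```shell
#!/bin/sh
# Manual equivalent of the two steps a full backup wraps; a sketch only.
# Assumptions: the snap_<table> naming and the DRY_RUN switch are illustrative.

# Take a snapshot of one table through the HBase shell.
take_snapshot() {
  table=$1
  snap=$2
  if [ "${DRY_RUN:-1}" = "1" ]; then
    # Dry run: print the command instead of running it.
    echo "echo \"snapshot '$table', '$snap'\" | hbase shell"
  else
    echo "snapshot '$table', '$snap'" | hbase shell
  fi
}

# Copy the snapshot to another filesystem with the ExportSnapshot tool.
export_snapshot() {
  snap=$1
  target=$2
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot $snap -copy-to $target"
  else
    hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot "$snap" -copy-to "$target"
  fi
}

# Snapshot one table, then export it; the export is skipped if the snapshot fails.
manual_full_backup() {
  table=$1
  target=$2
  snap="snap_$table"
  take_snapshot "$table" "$snap" && export_snapshot "$snap" "$target"
}
```

      With DRY_RUN left at its default of 1, manual_full_backup t1_dn hdfs://hostname.targetcluster.org:9000/userid/backupdir only prints the two commands, which makes it safe to try on a machine without HBase.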

      • bin/hbase : end-user command line interface to invoke BackupClient and RestoreClient
      • BackupClient.java : 'main' entry for backup operations. This patch only supports 'full' backup. In future JIRAs, it will support:
        • create incremental backup
        • cancel an ongoing backup
        • delete an existing backup image
        • describe the detailed information of a backup image
        • show history of all successful backups
        • show the status of the latest backup request
        • convert incremental backup WAL files into HFiles, either on the fly during create or after create
        • merge backup images
        • stop backing up a table of an existing backup image
        • show tables of a backup image
      • BackupCommands.java : a place to keep all the command usages and options
      • BackupManager.java : handles backup requests on the server side and creates backup ZooKeeper nodes to keep track of backups. The timestamps kept in ZooKeeper will be used for future incremental backups (not included in this JIRA). Creates BackupContext and DispatchRequest.
      • BackupHandler.java : in this patch, it is a wrapper around snapshot and ExportSnapshot. In future JIRAs, it will:
        • record timestamp info in ZK
        • carry out incremental backup
        • update backup progress
        • set status flags
        • build up the BackupManifest file (in this JIRA only limited info for full backup; later on, timestamps and dependencies among multiple backup images will also be recorded here)
        • clean up after a failed backup
        • clean up after a cancelled backup
        • allow on-the-fly conversion during incremental backup
      • BackupContext.java : encapsulates backup information such as backup ID, table names, directory info, phase, timestamps of backup progress, size of data, ancestor info, etc.
      • BackupCopier.java : the copying operation. Later on, it will support progress reporting and mapper estimation, and will extend DistCp to report progress to ZK during backup.
      • BackupException.java : handles exceptions from backup/restore
      • BackupManifest.java : encapsulates all the backup image information. The manifest info will be bundled as a manifest file together with the data, so that each backup image contains all the info needed for restore.
      • BackupStatus.java : encapsulate backup status at table level during backup progress
      • BackupUtil.java : utility methods during backup process
      • RestoreClient.java : 'main' entry for restore operations. This patch only supports restoring from a 'full' backup.
      • RestoreUtil.java : utility methods during restore process
      • ExportSnapshot.java : remove 'final' so that another class, SnapshotCopy.java, can extend it
      • SnapshotCopy.java : only a wrapper at this moment, but will be extended to keep track of progress (maybe this should be implemented in ExportSnapshot directly?)
      • BackupRestoreConstants.java : add the constants used by backup/restore code.
      • HBackupFilesystem.java : the filesystem-related API used by BackupClient and RestoreClient.

      Global log roll

      Currently, a customized implementation lives under org.apache.hadoop.hbase.backup.master and org.apache.hadoop.hbase.backup.regionserver.
      HBASE-11148 has been opened to provide a general 'global log roll'; the full backup code will be modified to use it once HBASE-11148 is accepted by the community.


      • Currently, the code is under hbase-server because it already contains a package named 'backup'. If it is moved to hbase-client, the pom file has to be updated to include more dependencies.
      • Currently, it is invoked through the bin/hbase script as a CLI. One advantage is that it is easy to embed in a Linux sh script.
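
      As an illustration of embedding the CLI in an sh script, the sketch below builds the full-backup command line shown in the example above from a list of table names. The helper names join_tables, backup_cmd and run_backup and the DRY_RUN switch are hypothetical; only the hbase backup create full syntax comes from this issue.

```shell
#!/bin/sh
# Sketch of a script wrapping the CLI from this issue; helper names are hypothetical.

# Join the arguments into the comma-separated table list the CLI expects.
# The function body runs in a subshell so the IFS change does not leak.
join_tables() (
  IFS=,
  echo "$*"
)

# Build the full-backup command line; with no tables, all tables are backed up.
backup_cmd() {
  backup_dir=$1
  shift
  if [ "$#" -eq 0 ]; then
    echo "hbase backup create full $backup_dir"
  else
    echo "hbase backup create full $backup_dir $(join_tables "$@")"
  fi
}

# Print the command when DRY_RUN=1 (the default here), otherwise run it.
run_backup() {
  cmd=$(backup_cmd "$@")
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$cmd"
  else
    # Relies on word splitting, so paths and table names must not contain spaces.
    $cmd
  fi
}
```

      A scheduled job could then call DRY_RUN=0 run_backup hdfs://hostname.targetcluster.org:9000/userid/backupdir t1_dn t2_dn t3_dn and act on the exit status.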


        1. HBASE-10900-trunk-v4.patch
          217 kB
          Demai Ni
        2. HBASE-10900-trunk-v3.patch
          160 kB
          Demai Ni
        3. HBASE-10900-trunk-v2.patch
          160 kB
          Demai Ni
        4. HBASE-10900-fullbackup-trunk-v1.patch
          151 kB
          Demai Ni




              jinghe Jerry He
              nidmhbase Demai Ni