Uploaded image for project: 'HBase'
  1. HBase
  2. HBASE-19568

Restore of HBase table using incremental backup doesn't restore rows from an earlier incremental backup

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • None
    • 3.0.0-alpha-1
    • None
    • None
    • Reviewed

    Description

      Credits to romil.choksi

      Restore of bulk-loaded HBase table doesn't restore deleted rows
      Steps:
      Create usertable and insert a few rows in it
      Create full backup
      Bulk load into usertable, and create first incremental backup
      Bulk load into usertable again, and create second incremental backup
      Delete row each from initial insert, first bulk load and second bulk load
      Restore usertable using second incremental backup
      Verify if each of the deleted rows has been restored
      On restore using second incremental backup id, the test failed as all of the rows from first bulk load were not available. Data from initial insertion (full backup) and second bulk load were only available.

      Attachments

        1. HBASE-19568-v1.patch
          111 kB
          Vladimir Rodionov
        2. HBASE-19568-v2.patch
          36 kB
          Vladimir Rodionov
        3. HBASE-19568-v3.patch
          33 kB
          Vladimir Rodionov
        4. HBASE-19568-v4.patch
          34 kB
          Vladimir Rodionov

        Issue Links

          Activity

            Patch v1. cc: tedyu@apache.org, elserj

            vrodionov Vladimir Rodionov added a comment - Patch v1. cc: tedyu@apache.org , elserj
            yuzhihong@gmail.com Ted Yu added a comment -
            116	    // #4 balk load again
            

            Correct spelling mistake.

            Is the fix mostly done in copyBulkLoadedFiles() ?

            yuzhihong@gmail.com Ted Yu added a comment - 116 // #4 balk load again Correct spelling mistake. Is the fix mostly done in copyBulkLoadedFiles() ?
            elserj Josh Elser added a comment -
            +      LOG.debug("Added region procedure manager: " + regionProcedureClass+"\nAdded region observer: " +
            +          regionObserverClass);
            

            Can you change the newline to a ". " (period and space) just to keep this all on one line (helps in grepping/parsing logs)

            +  private void updateFileLists(List<String> activeFiles, List<String> archiveFiles)
            +      throws IOException {
            +    FileSystem fs = FileSystem.get(conf);
            +    List<String> newlyArchived = new ArrayList<String>();
            

            Should a FileSystem instance be a member to IncrementalTableBackupClient? Or, pass it in as an argument from copyBulkLoadedFiles(..).

            +  private List<Path> getFilesRecursively(String fileBackupDir) throws IllegalArgumentException,
            +    IOException
            +  {
            +    FileSystem fs = FileSystem.get((new Path(fileBackupDir)).toUri(),
            +      new Configuration());
            

            How about passing in the Configuration instead of creating a new one? You have one in restoreImages already.

            +    RemoteIterator<LocatedFileStatus> it = fs.listFiles(new Path(fileBackupDir), true);
            +    while (it.hasNext()) {
            +      Path p = it.next().getPath();
            +      if (HFile.isHFileFormat(fs, p)) {
            +        list.add(p);
            +      }
            +    }
            

            Suggest FileSystem.listStatus(Path, PathFilter) might be a bit more "terse".

            Nice test addition to catch this!

            Is the fix mostly done in copyBulkLoadedFiles() ?

            Agreed – any chance I can convince you to move some of the cleanup-related things (looks mostly like the improvements from the stalled/blocked FT patch from others)? This seems like a really simple bug to fix, but it gets lost in the renames. Would be happy to commit the cleanup stuff immediately (given that was already reviewed).

            elserj Josh Elser added a comment - + LOG.debug( "Added region procedure manager: " + regionProcedureClass+ "\nAdded region observer: " + + regionObserverClass); Can you change the newline to a ". " (period and space) just to keep this all on one line (helps in grepping/parsing logs) + private void updateFileLists(List< String > activeFiles, List< String > archiveFiles) + throws IOException { + FileSystem fs = FileSystem.get(conf); + List< String > newlyArchived = new ArrayList< String >(); Should a FileSystem instance be a member to IncrementalTableBackupClient? Or, pass it in as an argument from copyBulkLoadedFiles(..) . + private List<Path> getFilesRecursively( String fileBackupDir) throws IllegalArgumentException, + IOException + { + FileSystem fs = FileSystem.get(( new Path(fileBackupDir)).toUri(), + new Configuration()); How about passing in the Configuration instead of creating a new one? You have one in restoreImages already. + RemoteIterator<LocatedFileStatus> it = fs.listFiles( new Path(fileBackupDir), true ); + while (it.hasNext()) { + Path p = it.next().getPath(); + if (HFile.isHFileFormat(fs, p)) { + list.add(p); + } + } Suggest FileSystem.listStatus(Path, PathFilter) might be a bit more "terse". Nice test addition to catch this! Is the fix mostly done in copyBulkLoadedFiles() ? Agreed – any chance I can convince you to move some of the cleanup-related things (looks mostly like the improvements from the stalled/blocked FT patch from others)? This seems like a really simple bug to fix, but it gets lost in the renames. Would be happy to commit the cleanup stuff immediately (given that was already reviewed).

            How about passing in the Configuration instead of creating a new one? You have one in restoreImages already.

            That is conf for local cluster, won't work.

            vrodionov Vladimir Rodionov added a comment - How about passing in the Configuration instead of creating a new one? You have one in restoreImages already. That is conf for local cluster, won't work.

            Patch v2. Reverted system table renaming back and did some minor fixes

            vrodionov Vladimir Rodionov added a comment - Patch v2. Reverted system table renaming back and did some minor fixes

            FileSystem.listStatus(Path, PathFilter)

            I tried this, but it lists only one directory level.

            vrodionov Vladimir Rodionov added a comment - FileSystem.listStatus(Path, PathFilter) I tried this, but it lists only one directory level.
            hadoopqa Hadoop QA added a comment -
            -1 overall



            Vote Subsystem Runtime Comment
            0 reexec 0m 0s Docker mode activated.
            -1 patch 0m 3s HBASE-19568 does not apply to master. Rebase required? Wrong Branch? See https://yetus.apache.org/documentation/0.6.0/precommit-patchnames for help.



            This message was automatically generated.

            hadoopqa Hadoop QA added a comment - -1 overall Vote Subsystem Runtime Comment 0 reexec 0m 0s Docker mode activated. -1 patch 0m 3s HBASE-19568 does not apply to master. Rebase required? Wrong Branch? See https://yetus.apache.org/documentation/0.6.0/precommit-patchnames for help. Subsystem Report/Notes JIRA Issue HBASE-19568 JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12904480/HBASE-19568-v2.patch Console output https://builds.apache.org/job/PreCommit-HBASE-Build/10892/console Powered by Apache Yetus 0.6.0 http://yetus.apache.org This message was automatically generated.
            elserj Josh Elser added a comment -

            I tried this, but it lists only one directory level.

            Ahh, yeah. That would make sense. What you have now is fine then, I think.

            elserj Josh Elser added a comment - I tried this, but it lists only one directory level. Ahh, yeah. That would make sense. What you have now is fine then, I think.
            elserj Josh Elser added a comment -

            v2 looks OK to me. Maybe Ted wants to take another look too?

            elserj Josh Elser added a comment - v2 looks OK to me. Maybe Ted wants to take another look too?
            yuzhihong@gmail.com Ted Yu added a comment -

            Can we get a clean QA run ?

            yuzhihong@gmail.com Ted Yu added a comment - Can we get a clean QA run ?

            Rebased the patch to the current master

            vrodionov Vladimir Rodionov added a comment - Rebased the patch to the current master
            hadoopqa Hadoop QA added a comment -
            -1 overall



            Vote Subsystem Runtime Comment
            0 reexec 0m 10s Docker mode activated.
                  Prechecks
            0 findbugs 0m 0s Findbugs executables are not available.
            +1 hbaseanti 0m 0s Patch does not have any anti-patterns.
            +1 @author 0m 0s The patch does not contain any @author tags.
            +1 test4tests 0m 0s The patch appears to include 2 new or modified test files.
                  master Compile Tests
            +1 mvninstall 4m 10s master passed
            +1 compile 0m 18s master passed
            +1 checkstyle 0m 12s master passed
            +1 shadedjars 4m 16s branch has no errors when building our shaded downstream artifacts.
            +1 javadoc 0m 11s master passed
                  Patch Compile Tests
            +1 mvninstall 4m 12s the patch passed
            +1 compile 0m 19s the patch passed
            +1 javac 0m 19s the patch passed
            -1 checkstyle 0m 12s hbase-backup: The patch generated 8 new + 72 unchanged - 10 fixed = 80 total (was 82)
            -1 whitespace 0m 0s The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply
            +1 shadedjars 4m 7s patch has no errors when building our shaded downstream artifacts.
            +1 hadoopcheck 18m 16s Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0.
            +1 javadoc 0m 12s the patch passed
                  Other Tests
            +1 unit 11m 41s hbase-backup in the patch passed.
            +1 asflicense 0m 10s The patch does not generate ASF License warnings.
            44m 16s



            Subsystem Report/Notes
            Docker Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01
            JIRA Issue HBASE-19568
            JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12905540/HBASE-19568-v3.patch
            Optional Tests asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile
            uname Linux 878697030d20 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 GNU/Linux
            Build tool maven
            Personality /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
            git revision master / ee3accb370
            maven version: Apache Maven 3.5.2 (138edd61fd100ec658bfa2d307c43b76940a5d7d; 2017-10-18T07:58:13Z)
            Default Java 1.8.0_151
            checkstyle https://builds.apache.org/job/PreCommit-HBASE-Build/10991/artifact/patchprocess/diff-checkstyle-hbase-backup.txt
            whitespace https://builds.apache.org/job/PreCommit-HBASE-Build/10991/artifact/patchprocess/whitespace-eol.txt
            Test Results https://builds.apache.org/job/PreCommit-HBASE-Build/10991/testReport/
            modules C: hbase-backup U: hbase-backup
            Console output https://builds.apache.org/job/PreCommit-HBASE-Build/10991/console
            Powered by Apache Yetus 0.6.0 http://yetus.apache.org

            This message was automatically generated.

            hadoopqa Hadoop QA added a comment - -1 overall Vote Subsystem Runtime Comment 0 reexec 0m 10s Docker mode activated.       Prechecks 0 findbugs 0m 0s Findbugs executables are not available. +1 hbaseanti 0m 0s Patch does not have any anti-patterns. +1 @author 0m 0s The patch does not contain any @author tags. +1 test4tests 0m 0s The patch appears to include 2 new or modified test files.       master Compile Tests +1 mvninstall 4m 10s master passed +1 compile 0m 18s master passed +1 checkstyle 0m 12s master passed +1 shadedjars 4m 16s branch has no errors when building our shaded downstream artifacts. +1 javadoc 0m 11s master passed       Patch Compile Tests +1 mvninstall 4m 12s the patch passed +1 compile 0m 19s the patch passed +1 javac 0m 19s the patch passed -1 checkstyle 0m 12s hbase-backup: The patch generated 8 new + 72 unchanged - 10 fixed = 80 total (was 82) -1 whitespace 0m 0s The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply +1 shadedjars 4m 7s patch has no errors when building our shaded downstream artifacts. +1 hadoopcheck 18m 16s Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. +1 javadoc 0m 12s the patch passed       Other Tests +1 unit 11m 41s hbase-backup in the patch passed. +1 asflicense 0m 10s The patch does not generate ASF License warnings. 44m 16s Subsystem Report/Notes Docker Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 JIRA Issue HBASE-19568 JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12905540/HBASE-19568-v3.patch Optional Tests asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile uname Linux 878697030d20 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 GNU/Linux Build tool maven Personality /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh git revision master / ee3accb370 maven version: Apache Maven 3.5.2 (138edd61fd100ec658bfa2d307c43b76940a5d7d; 2017-10-18T07:58:13Z) Default Java 1.8.0_151 checkstyle https://builds.apache.org/job/PreCommit-HBASE-Build/10991/artifact/patchprocess/diff-checkstyle-hbase-backup.txt whitespace https://builds.apache.org/job/PreCommit-HBASE-Build/10991/artifact/patchprocess/whitespace-eol.txt Test Results https://builds.apache.org/job/PreCommit-HBASE-Build/10991/testReport/ modules C: hbase-backup U: hbase-backup Console output https://builds.apache.org/job/PreCommit-HBASE-Build/10991/console Powered by Apache Yetus 0.6.0 http://yetus.apache.org This message was automatically generated.

            Patch v4 addresses checkstyle and whitespaces

            vrodionov Vladimir Rodionov added a comment - Patch v4 addresses checkstyle and whitespaces
            hadoopqa Hadoop QA added a comment -
            -1 overall



            Vote Subsystem Runtime Comment
            0 reexec 0m 23s Docker mode activated.
                  Prechecks
            0 findbugs 0m 0s Findbugs executables are not available.
            +1 hbaseanti 0m 0s Patch does not have any anti-patterns.
            +1 @author 0m 0s The patch does not contain any @author tags.
            +1 test4tests 0m 0s The patch appears to include 2 new or modified test files.
                  master Compile Tests
            +1 mvninstall 4m 36s master passed
            +1 compile 0m 18s master passed
            +1 checkstyle 0m 13s master passed
            +1 shadedjars 4m 23s branch has no errors when building our shaded downstream artifacts.
            +1 javadoc 0m 13s master passed
                  Patch Compile Tests
            +1 mvninstall 4m 14s the patch passed
            +1 compile 0m 18s the patch passed
            +1 javac 0m 18s the patch passed
            -1 checkstyle 0m 14s hbase-backup: The patch generated 1 new + 71 unchanged - 11 fixed = 72 total (was 82)
            -1 whitespace 0m 0s The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply
            +1 shadedjars 4m 9s patch has no errors when building our shaded downstream artifacts.
            +1 hadoopcheck 18m 22s Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0.
            +1 javadoc 0m 13s the patch passed
                  Other Tests
            +1 unit 16m 14s hbase-backup in the patch passed.
            +1 asflicense 0m 11s The patch does not generate ASF License warnings.
            49m 52s



            Subsystem Report/Notes
            Docker Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01
            JIRA Issue HBASE-19568
            JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12905579/HBASE-19568-v4.patch
            Optional Tests asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile
            uname Linux 71e56e77da47 4.4.0-89-generic #112-Ubuntu SMP Mon Jul 31 19:38:41 UTC 2017 x86_64 GNU/Linux
            Build tool maven
            Personality /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
            git revision master / ee3accb370
            maven version: Apache Maven 3.5.2 (138edd61fd100ec658bfa2d307c43b76940a5d7d; 2017-10-18T07:58:13Z)
            Default Java 1.8.0_151
            checkstyle https://builds.apache.org/job/PreCommit-HBASE-Build/10999/artifact/patchprocess/diff-checkstyle-hbase-backup.txt
            whitespace https://builds.apache.org/job/PreCommit-HBASE-Build/10999/artifact/patchprocess/whitespace-eol.txt
            Test Results https://builds.apache.org/job/PreCommit-HBASE-Build/10999/testReport/
            modules C: hbase-backup U: hbase-backup
            Console output https://builds.apache.org/job/PreCommit-HBASE-Build/10999/console
            Powered by Apache Yetus 0.6.0 http://yetus.apache.org

            This message was automatically generated.

            hadoopqa Hadoop QA added a comment - -1 overall Vote Subsystem Runtime Comment 0 reexec 0m 23s Docker mode activated.       Prechecks 0 findbugs 0m 0s Findbugs executables are not available. +1 hbaseanti 0m 0s Patch does not have any anti-patterns. +1 @author 0m 0s The patch does not contain any @author tags. +1 test4tests 0m 0s The patch appears to include 2 new or modified test files.       master Compile Tests +1 mvninstall 4m 36s master passed +1 compile 0m 18s master passed +1 checkstyle 0m 13s master passed +1 shadedjars 4m 23s branch has no errors when building our shaded downstream artifacts. +1 javadoc 0m 13s master passed       Patch Compile Tests +1 mvninstall 4m 14s the patch passed +1 compile 0m 18s the patch passed +1 javac 0m 18s the patch passed -1 checkstyle 0m 14s hbase-backup: The patch generated 1 new + 71 unchanged - 11 fixed = 72 total (was 82) -1 whitespace 0m 0s The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply +1 shadedjars 4m 9s patch has no errors when building our shaded downstream artifacts. +1 hadoopcheck 18m 22s Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. +1 javadoc 0m 13s the patch passed       Other Tests +1 unit 16m 14s hbase-backup in the patch passed. +1 asflicense 0m 11s The patch does not generate ASF License warnings. 49m 52s Subsystem Report/Notes Docker Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 JIRA Issue HBASE-19568 JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12905579/HBASE-19568-v4.patch Optional Tests asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile uname Linux 71e56e77da47 4.4.0-89-generic #112-Ubuntu SMP Mon Jul 31 19:38:41 UTC 2017 x86_64 GNU/Linux Build tool maven Personality /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh git revision master / ee3accb370 maven version: Apache Maven 3.5.2 (138edd61fd100ec658bfa2d307c43b76940a5d7d; 2017-10-18T07:58:13Z) Default Java 1.8.0_151 checkstyle https://builds.apache.org/job/PreCommit-HBASE-Build/10999/artifact/patchprocess/diff-checkstyle-hbase-backup.txt whitespace https://builds.apache.org/job/PreCommit-HBASE-Build/10999/artifact/patchprocess/whitespace-eol.txt Test Results https://builds.apache.org/job/PreCommit-HBASE-Build/10999/testReport/ modules C: hbase-backup U: hbase-backup Console output https://builds.apache.org/job/PreCommit-HBASE-Build/10999/console Powered by Apache Yetus 0.6.0 http://yetus.apache.org This message was automatically generated.
            elserj Josh Elser added a comment -

            LGTM, can fix the whitespace on commit.

            elserj Josh Elser added a comment - LGTM, can fix the whitespace on commit.
            elserj Josh Elser added a comment -

            Pushed. Thanks for the patch, Vlad!

            elserj Josh Elser added a comment - Pushed. Thanks for the patch, Vlad!
            hudson Hudson added a comment -

            FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #4390 (See https://builds.apache.org/job/HBase-Trunk_matrix/4390/)
            HBASE-19568: Restore of HBase table using incremental backup doesn't (elserj: rev a5601c8eac6bfcac7d869574547f505d44e49065)

            • (edit) hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/TableBackupClient.java
            • (edit) hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/IncrementalTableBackupClient.java
            • (edit) hbase-backup/src/test/java/org/apache/hadoop/hbase/backup/TestIncrementalBackupWithBulkLoad.java
            • (edit) hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/BackupSystemTable.java
            • (edit) hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/BackupManager.java
            • (edit) hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/RestoreTablesClient.java
            • (edit) hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/BackupAdminImpl.java
            • (edit) hbase-backup/src/test/java/org/apache/hadoop/hbase/backup/TestBackupBase.java
            hudson Hudson added a comment - FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #4390 (See https://builds.apache.org/job/HBase-Trunk_matrix/4390/ ) HBASE-19568 : Restore of HBase table using incremental backup doesn't (elserj: rev a5601c8eac6bfcac7d869574547f505d44e49065) (edit) hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/TableBackupClient.java (edit) hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/IncrementalTableBackupClient.java (edit) hbase-backup/src/test/java/org/apache/hadoop/hbase/backup/TestIncrementalBackupWithBulkLoad.java (edit) hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/BackupSystemTable.java (edit) hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/BackupManager.java (edit) hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/RestoreTablesClient.java (edit) hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/BackupAdminImpl.java (edit) hbase-backup/src/test/java/org/apache/hadoop/hbase/backup/TestBackupBase.java

            People

              vrodionov Vladimir Rodionov
              romil.choksi Romil Choksi
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: