HBase / HBASE-28602

Incremental backup fails when WALs move

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.6.0, 3.0.0-beta-1, 4.0.0-alpha-1, 2.7.0
    • Fix Version/s: None
    • Component/s: backup&restore
    • Labels: None

    Description

      The incremental backup process appears to collect a set of WAL files and then process them. If one of those files moves between the two steps, the backup fails: per the trace below, the WALPlayer job submission calls WALInputFormat.getSplits, which throws a FileNotFoundException for the original path. This is reproducible as a flaky unit test, as seen in TestIncrementalBackup.TestIncBackupRestore:

      java.io.IOException: java.io.FileNotFoundException: File hdfs://localhost:39577/user/jenkins/test-data/f51646e4-e3e0-ef30-df2b-aa2a22ed41c3/WALs/94f4fe62ee7a,40249,1715620734331/94f4fe62ee7a%2C40249%2C1715620734331.94f4fe62ee7a%2C40249%2C1715620734331.regiongroup-0.1715620773674 does not exist.
      	at org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.execute(IncrementalTableBackupClient.java:289)
      	at org.apache.hadoop.hbase.backup.impl.BackupAdminImpl.backupTables(BackupAdminImpl.java:595)
      	at org.apache.hadoop.hbase.backup.TestIncrementalBackup.TestIncBackupRestore(TestIncrementalBackup.java:169)
      	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
      	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
      	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
      	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
      	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
      	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
      	at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
      	at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
      	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
      	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
      	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
      	at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
      	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
      	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
      	at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
      	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
      	at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
      	at org.junit.runners.Suite.runChild(Suite.java:128)
      	at org.junit.runners.Suite.runChild(Suite.java:27)
      	at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
      	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
      	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
      	at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
      	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
      	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
      	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
      	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
      	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
      	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
      	at java.base/java.lang.Thread.run(Thread.java:840)
      Caused by: java.io.FileNotFoundException: File hdfs://localhost:39577/user/jenkins/test-data/f51646e4-e3e0-ef30-df2b-aa2a22ed41c3/WALs/94f4fe62ee7a,40249,1715620734331/94f4fe62ee7a%2C40249%2C1715620734331.94f4fe62ee7a%2C40249%2C1715620734331.regiongroup-0.1715620773674 does not exist.
      	at org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1282)
      	at org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.<init>(DistributedFileSystem.java:1256)
      	at org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1201)
      	at org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1197)
      	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
      	at org.apache.hadoop.hdfs.DistributedFileSystem.listLocatedStatus(DistributedFileSystem.java:1215)
      	at org.apache.hadoop.fs.FileSystem.listLocatedStatus(FileSystem.java:2230)
      	at org.apache.hadoop.hbase.mapreduce.WALInputFormat.getFiles(WALInputFormat.java:356)
      	at org.apache.hadoop.hbase.mapreduce.WALInputFormat.getSplits(WALInputFormat.java:321)
      	at org.apache.hadoop.hbase.mapreduce.WALInputFormat.getSplits(WALInputFormat.java:301)
      	at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:310)
      	at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:327)
      	at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:200)
      	at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1678)
      	at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1675)
      	at java.base/java.security.AccessController.doPrivileged(AccessController.java:712)
      	at java.base/javax.security.auth.Subject.doAs(Subject.java:439)
      	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
      	at org.apache.hadoop.mapreduce.Job.submit(Job.java:1675)
      	at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1696)
      	at org.apache.hadoop.hbase.mapreduce.WALPlayer.run(WALPlayer.java:423)
      	at org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.walToHFiles(IncrementalTableBackupClient.java:406)
      	at org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.convertWALsToHFiles(IncrementalTableBackupClient.java:378)
      	at org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.execute(IncrementalTableBackupClient.java:282)
      	... 34 more
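
      To make the race concrete, here is a minimal sketch that reproduces the same list-then-read pattern on a local filesystem with java.nio.file and shows one possible mitigation: re-resolving each listed path just before use and falling back to an archive directory. MovedWalResolver and its resolve method are hypothetical names invented for this illustration; they are not part of HBase, and this is not a proposed patch. The WALs and oldWALs directory names only mirror HBase's layout and are plain temp directories here.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration only; not an HBase class. */
final class MovedWalResolver {
  private MovedWalResolver() {}

  /**
   * Returns the current location of each previously listed WAL: the original
   * path if it still exists, otherwise a file with the same name under
   * archiveDir. Throws FileNotFoundException if neither location has the file.
   */
  static List<Path> resolve(List<Path> listedWals, Path archiveDir) throws IOException {
    List<Path> resolved = new ArrayList<>();
    for (Path wal : listedWals) {
      if (Files.exists(wal)) {
        resolved.add(wal);
        continue;
      }
      // Assumption: a moved WAL keeps its file name in the archive directory.
      Path archived = archiveDir.resolve(wal.getFileName());
      if (!Files.exists(archived)) {
        throw new FileNotFoundException(
            "File " + wal + " does not exist, and no copy was found at " + archived);
      }
      resolved.add(archived);
    }
    return resolved;
  }

  public static void main(String[] args) throws IOException {
    Path root = Files.createTempDirectory("wal-demo");
    Path walDir = Files.createDirectory(root.resolve("WALs"));
    Path archiveDir = Files.createDirectory(root.resolve("oldWALs"));
    Path wal = Files.createFile(walDir.resolve("server.regiongroup-0.1715620773674"));

    // Step 1: the backup collects its list of WAL files.
    List<Path> listed = List.of(wal);

    // Step 2: the WAL moves before the backup reads it.
    Files.move(wal, archiveDir.resolve(wal.getFileName()));

    // Step 3: reading the stale path would fail; re-resolving finds the archived copy.
    System.out.println("Stale path still exists: " + Files.exists(wal));
    System.out.println("Resolved paths: " + resolve(listed, archiveDir));
  }
}
```

      Note that this only narrows the window: a file can still move between the existence check and the actual read, so a real fix would likely also need to handle FileNotFoundException at read time.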
      

People

    • Assignee: Unassigned
    • Reporter: Nick Dimiduk (ndimiduk)