HBASE-19934

HBaseSnapshotException when read replicas are enabled and an online snapshot is taken after region splitting

Details

    • Reviewed

    Description

      While investigating HBASE-19893, I encountered another issue.

      Steps to reproduce are as follows:

      1. Create a table

      create "test", "cf", {REGION_REPLICATION => 2}

      2. Load data to the table

      (0...2000).each{|i| put "test", "row#{i}", "cf:col", "val"}

      3. Split the table

      split "test"

      4. Take a snapshot for the table

      snapshot "test", "snap"

      The snapshot command then fails with the following error:

      hbase(main):004:0> snapshot "test", "snap"
      
      ERROR: org.apache.hadoop.hbase.snapshot.HBaseSnapshotException: Snapshot { ss=snap table=test type=FLUSH } had an error. Procedure snap { waiting=[] done=[] }
      at org.apache.hadoop.hbase.master.snapshot.SnapshotManager.isSnapshotDone(SnapshotManager.java:379)
      at org.apache.hadoop.hbase.master.MasterRpcServices.isSnapshotDone(MasterRpcServices.java:1144)
      at org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java)
      at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:406)
      at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
      at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
      at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
      Caused by: org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException via Failed taking snapshot { ss=snap table=test type=FLUSH } due to exception:Manifest region info {ENCODED => b910488a686644a7c1c85246d0d123d5, NAME => 'test,,1517808523837_0001.b910488a686644a7c1c85246d0d123d5.', STARTKEY => '', ENDKEY => '', OFFLINE => true, SPLIT => true, REPLICA_ID => 1}doesn't match expected region:{ENCODED => ef8665859c0b19927b7dc127ec10120a, NAME => 'test,,1517808523837.ef8665859c0b19927b7dc127ec10120a.', STARTKEY => '', ENDKEY => '', OFFLINE => true, SPLIT => true}:org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException: Manifest region info {ENCODED => b910488a686644a7c1c85246d0d123d5, NAME => 'test,,1517808523837_0001.b910488a686644a7c1c85246d0d123d5.', STARTKEY => '', ENDKEY => '', OFFLINE => true, SPLIT => true, REPLICA_ID => 1}doesn't match expected region:{ENCODED => ef8665859c0b19927b7dc127ec10120a, NAME => 'test,,1517808523837.ef8665859c0b19927b7dc127ec10120a.', STARTKEY => '', ENDKEY => '', OFFLINE => true, SPLIT => true}
      at org.apache.hadoop.hbase.errorhandling.ForeignExceptionDispatcher.rethrowException(ForeignExceptionDispatcher.java:82)
      at org.apache.hadoop.hbase.master.snapshot.TakeSnapshotHandler.rethrowExceptionIfFailed(TakeSnapshotHandler.java:306)
      at org.apache.hadoop.hbase.master.snapshot.SnapshotManager.isSnapshotDone(SnapshotManager.java:368)
      ... 6 more
      Caused by: org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException: Manifest region info {ENCODED => b910488a686644a7c1c85246d0d123d5, NAME => 'test,,1517808523837_0001.b910488a686644a7c1c85246d0d123d5.', STARTKEY => '', ENDKEY => '', OFFLINE => true, SPLIT => true, REPLICA_ID => 1}doesn't match expected region:{ENCODED => ef8665859c0b19927b7dc127ec10120a, NAME => 'test,,1517808523837.ef8665859c0b19927b7dc127ec10120a.', STARTKEY => '', ENDKEY => '', OFFLINE => true, SPLIT => true}
      at org.apache.hadoop.hbase.master.snapshot.MasterSnapshotVerifier.verifyRegionInfo(MasterSnapshotVerifier.java:223)
      at org.apache.hadoop.hbase.master.snapshot.MasterSnapshotVerifier.verifyRegions(MasterSnapshotVerifier.java:201)
      at org.apache.hadoop.hbase.master.snapshot.MasterSnapshotVerifier.verifySnapshot(MasterSnapshotVerifier.java:119)
      at org.apache.hadoop.hbase.master.snapshot.TakeSnapshotHandler.process(TakeSnapshotHandler.java:202)
      at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      at java.lang.Thread.run(Thread.java:745)
      
      Take a snapshot of specified table. Examples:
      
      hbase> snapshot 'sourceTable', 'snapshotName'
      hbase> snapshot 'namespace:sourceTable', 'snapshotName', {SKIP_FLUSH => true}
      
      Took 0.3390 seconds
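      The two region names in the CorruptedSnapshotException message can be compared with a small plain-Ruby sketch. `parse_region_name` is a hypothetical helper, not part of HBase; it assumes region names follow the layout `<table>,<start key>,<region id>[_<replica id>].<encoded name>.` and that the replica suffix is hexadecimal. Under those assumptions, the manifest entry and the expected entry appear to describe the same split parent region and differ only in replica ID.

```ruby
# Hypothetical helper (not part of HBase): split a region name of the assumed form
#   <table>,<start key>,<region id>[_<replica id>].<encoded name>.
# into its parts. The replica suffix is assumed to be four hex digits.
def parse_region_name(name)
  pattern = /\A(?<table>[^,]+),(?<start_key>.*),(?<region_id>\d+)(?:_(?<replica>\h{4}))?\.(?<encoded>\h+)\.\z/
  m = pattern.match(name)
  raise ArgumentError, "unrecognized region name: #{name}" unless m

  {
    table: m[:table],
    start_key: m[:start_key],
    region_id: m[:region_id].to_i,
    # A name without a suffix is treated as the primary replica (ID 0).
    replica_id: m[:replica] ? m[:replica].to_i(16) : 0,
    encoded: m[:encoded]
  }
end

# Region names copied from the error message in this report.
manifest = parse_region_name("test,,1517808523837_0001.b910488a686644a7c1c85246d0d123d5.")
expected = parse_region_name("test,,1517808523837.ef8665859c0b19927b7dc127ec10120a.")

# Same table, start key and region ID means the same logical region.
same_region = %i[table start_key region_id].all? { |k| manifest[k] == expected[k] }

puts "same logical region: #{same_region}"
puts "manifest replica id: #{manifest[:replica_id]}"
puts "expected replica id: #{expected[:replica_id]}"
```

      For the names above this reports the same logical region with replica IDs 1 and 0, which matches the `REPLICA_ID => 1` field shown only on the manifest side of the message.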

      Attachments

        1. HBASE-19934.branch-1.001.patch
          6 kB
          Toshihiro Suzuki
        2. HBASE-19934-branch-1.patch
          5 kB
          Toshihiro Suzuki
        3. HBASE-19934-v3.patch
          5 kB
          Toshihiro Suzuki
        4. HBASE-19934-v3.patch
          5 kB
          Toshihiro Suzuki
        5. HBASE-19934-v2.patch
          4 kB
          Toshihiro Suzuki
        6. HBASE-19934.patch
          4 kB
          Toshihiro Suzuki
        7. HBASE-19934.patch
          4 kB
          Toshihiro Suzuki
        8. HBASE-19934.patch
          4 kB
          Toshihiro Suzuki
        9. HBASE-19934.patch
          4 kB
          Toshihiro Suzuki
        10. HBASE-19934-UT.patch
          3 kB
          Toshihiro Suzuki

            People

              brfrn169 Toshihiro Suzuki