HBase / HBASE-10786

If snapshot verification fails with 'Regions moved', the message should contain the name of the region causing the failure

    Details

    • Type: Task
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.98.1, 0.99.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      I was trying to find the cause of the test failure in https://builds.apache.org/job/PreCommit-HBASE-Build/9036//testReport/org.apache.hadoop.hbase.snapshot/TestSecureExportSnapshot/testExportRetry/:

      org.apache.hadoop.hbase.snapshot.HBaseSnapshotException: org.apache.hadoop.hbase.snapshot.HBaseSnapshotException: Snapshot { ss=emptySnaptb0-1395177346656 table=testtb-1395177346656 type=FLUSH } had an error.  Procedure emptySnaptb0-1395177346656 { waiting=[] done=[] }
      	at org.apache.hadoop.hbase.master.snapshot.SnapshotManager.isSnapshotDone(SnapshotManager.java:342)
      	at org.apache.hadoop.hbase.master.HMaster.isSnapshotDone(HMaster.java:3007)
      	at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java:40494)
      	at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2020)
      	at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
      	at org.apache.hadoop.hbase.ipc.FifoRpcScheduler$1.run(FifoRpcScheduler.java:73)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
      	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:138)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
      	at java.lang.Thread.run(Thread.java:662)
      Caused by: org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException via Failed taking snapshot { ss=emptySnaptb0-1395177346656 table=testtb-1395177346656 type=FLUSH } due to exception:Regions moved during the snapshot '{ ss=emptySnaptb0-1395177346656 table=testtb-1395177346656 type=FLUSH }'. expected=9 snapshotted=8:org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException: Regions moved during the snapshot '{ ss=emptySnaptb0-1395177346656 table=testtb-1395177346656 type=FLUSH }'. expected=9 snapshotted=8
      	at org.apache.hadoop.hbase.errorhandling.ForeignExceptionDispatcher.rethrowException(ForeignExceptionDispatcher.java:83)
      	at org.apache.hadoop.hbase.master.snapshot.TakeSnapshotHandler.rethrowExceptionIfFailed(TakeSnapshotHandler.java:320)
      	at org.apache.hadoop.hbase.master.snapshot.SnapshotManager.isSnapshotDone(SnapshotManager.java:332)
      	... 11 more
      

      However, the exception does not say which region caused the verification to fail: it reports only the counts (expected=9 snapshotted=8).
      I searched the logs for balancer activity but found none.

      The exception message should include the name of the region that caused the verification to fail.
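
The request above can be illustrated with a minimal sketch. It is not the attached patch and does not use HBase classes: the class name `RegionVerificationSketch`, its two methods, and the "missing regions" suffix in the message are hypothetical, chosen only to show how a count mismatch could be turned into a message that names the regions absent from the snapshot.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of HBase: shows one way to name the
// regions that are expected but absent from a snapshot.
class RegionVerificationSketch {

    // Returns the expected region names that were not snapshotted,
    // preserving the order of the expected collection.
    static List<String> missingRegions(Collection<String> expected,
                                       Collection<String> snapshotted) {
        Set<String> seen = new HashSet<>(snapshotted);
        List<String> missing = new ArrayList<>();
        for (String region : expected) {
            if (!seen.contains(region)) {
                missing.add(region);
            }
        }
        return missing;
    }

    // Builds a message in the style of the one in the stack trace,
    // extended with the names of the missing regions (assumed format).
    static String describeMismatch(String snapshot,
                                   Collection<String> expected,
                                   Collection<String> snapshotted) {
        return "Regions moved during the snapshot '" + snapshot + "'."
            + " expected=" + expected.size()
            + " snapshotted=" + snapshotted.size()
            + " missing regions=" + missingRegions(expected, snapshotted);
    }
}
```

With a message of this shape, a failure such as the one above would point directly at the region to look for in the region server and master logs.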

        Attachments

        1. 10786-v1.txt
          2 kB
          Ted Yu
        2. 10786-v2.txt
          2 kB
          Ted Yu
        3. 10786-v3.txt
          2 kB
          Ted Yu

            People

            • Assignee:
              yuzhihong@gmail.com Ted Yu
              Reporter:
              yuzhihong@gmail.com Ted Yu
            • Votes:
              0
              Watchers:
              4
