HBase / HBASE-15058

AssignmentManager should account for unsuccessful split correctly which initially passes quota check

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.2.0
    • Fix Version/s: 1.2.0, 1.3.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      When a region split doesn't pass the quota check, we see an exception similar to the following:

      2015-12-29 16:07:33,653 INFO  [RS:0;10.21.128.189:57449-splits-1451434041585] regionserver.SplitRequest(97): Running rollback/cleanup of failed split of np2:testRegionNormalizationSplitOnCluster,zzzzz,1451434045065.27cccb3fae03002b8058beef61cb7c20.; Failed to get ok from master to split np2:testRegionNormalizationSplitOnCluster,zzzzz,1451434045065.27cccb3fae03002b8058beef61cb7c20.
      java.io.IOException: Failed to get ok from master to split np2:testRegionNormalizationSplitOnCluster,zzzzz,1451434045065.27cccb3fae03002b8058beef61cb7c20.
        at org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.stepsBeforePONR(SplitTransactionImpl.java:345)
        at org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.createDaughters(SplitTransactionImpl.java:262)
        at org.apache.hadoop.hbase.regionserver.SplitTransactionImpl.execute(SplitTransactionImpl.java:502)
        at org.apache.hadoop.hbase.regionserver.SplitRequest.doSplitting(SplitRequest.java:82)
        at org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:155)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      

      However, a region split may also fail in a later SplitTransactionPhase within stepsBeforePONR(), after the quota check has already passed.
      Currently in branch-1, AssignmentManager#onRegionTransition() does not clearly distinguish among the following TransitionCodes:

          case SPLIT_PONR:
          case SPLIT:
          case SPLIT_REVERTED:
            errorMsg =
                onRegionSplit(serverName, code, hri, HRegionInfo.convert(transition.getRegionInfo(1)),
                  HRegionInfo.convert(transition.getRegionInfo(2)));
            if (org.apache.commons.lang.StringUtils.isEmpty(errorMsg)) {
              try {
                regionStateListener.onRegionSplitReverted(hri);
      

      onRegionSplit() handles all three of the above TransitionCodes. However, errorMsg is normally null (onRegionSplit() returns null at the end).
      As a result, StringUtils.isEmpty(errorMsg) is true and onRegionSplitReverted() is called for SPLIT_PONR and SPLIT as well, not only for SPLIT_REVERTED.

      When region split fails, AssignmentManager#onRegionTransition() should account for the failure properly so that quota bookkeeping is consistent.
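
The control-flow problem described above can be reproduced in isolation. The following is a minimal sketch, not the committed patch: SplitQuotaSketch, RegionCounter, handleAsReported, handleWithGuard and simulate are hypothetical names, and the bookkeeping model (the quota check reserves one extra region, a revert releases it) is an assumption made for illustration. Only the three TransitionCode names and the isEmpty(errorMsg) condition come from the report.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model for illustration; these are NOT the real HBase classes.
class SplitQuotaSketch {
  // The three transition codes named in the report.
  enum TransitionCode { SPLIT_PONR, SPLIT, SPLIT_REVERTED }

  // Assumed stand-in for quota bookkeeping: regions counted per namespace.
  static final class RegionCounter {
    private final Map<String, Integer> counts = new HashMap<>();

    RegionCounter(String namespace, int initialRegions) {
      counts.put(namespace, initialRegions);
    }

    int get(String namespace) { return counts.getOrDefault(namespace, 0); }

    // Assumption: passing the quota check reserves one extra region (one parent becomes two daughters).
    void onSplitApproved(String namespace) { counts.merge(namespace, 1, Integer::sum); }

    // Assumption: a reverted split releases that reservation.
    void onSplitReverted(String namespace) { counts.merge(namespace, -1, Integer::sum); }
  }

  // Stand-in for onRegionSplit(): returns null when no error occurred.
  static String onRegionSplit(TransitionCode code) { return null; }

  static boolean isEmpty(String s) { return s == null || s.isEmpty(); }

  // Flow as described in the report: the revert hook runs whenever errorMsg is empty.
  static void handleAsReported(TransitionCode code, String ns, RegionCounter counter) {
    String errorMsg = onRegionSplit(code);
    if (isEmpty(errorMsg)) {
      counter.onSplitReverted(ns);
    }
  }

  // One possible guard (an assumption, not the committed fix): revert only on SPLIT_REVERTED.
  static void handleWithGuard(TransitionCode code, String ns, RegionCounter counter) {
    String errorMsg = onRegionSplit(code);
    if (isEmpty(errorMsg) && code == TransitionCode.SPLIT_REVERTED) {
      counter.onSplitReverted(ns);
    }
  }

  // Starts with one region, approves one split, replays the codes, returns the final count.
  static int simulate(boolean guarded, TransitionCode... codes) {
    String ns = "np2";
    RegionCounter counter = new RegionCounter(ns, 1);
    counter.onSplitApproved(ns);
    for (TransitionCode code : codes) {
      if (guarded) {
        handleWithGuard(code, ns, counter);
      } else {
        handleAsReported(code, ns, counter);
      }
    }
    return counter.get(ns);
  }

  public static void main(String[] args) {
    // A successful split reports SPLIT_PONR and then SPLIT; two regions should remain counted.
    System.out.println(simulate(false, TransitionCode.SPLIT_PONR, TransitionCode.SPLIT)); // prints 0
    System.out.println(simulate(true, TransitionCode.SPLIT_PONR, TransitionCode.SPLIT));  // prints 2
    // A failed split reports SPLIT_REVERTED; the count should return to one.
    System.out.println(simulate(false, TransitionCode.SPLIT_REVERTED));                   // prints 1
    System.out.println(simulate(true, TransitionCode.SPLIT_REVERTED));                    // prints 1
  }
}
```

In this model the unguarded flow undercounts after a successful split, while both variants agree when the split is actually reverted. Whether a non-empty errorMsg for SPLIT_PONR or SPLIT should also release the reservation is left open here; the attached patches are the authoritative answer.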

      Attachments

        1. 15058-branch-1-v1.txt
          3 kB
          Ted Yu
        2. 15058-branch-1-v2.txt
          3 kB
          Ted Yu

          People

            Assignee: Ted Yu (yuzhihong@gmail.com)
            Reporter: Ted Yu (yuzhihong@gmail.com)
            Votes: 0
            Watchers: 5