  1. HBase
  2. HBASE-4094

Improve hbck tool to fix more HBase problems

Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 0.90.3
    • Fix Version/s: None
    • Component/s: master
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      The hbck tool (org.apache.hadoop.hbase.util.HBaseFsck) can check for and repair consistency problems.
      Some errors are only detected, with no way offered to repair them. I plan to fix these either with other tools (close_region...) or with new methods.
      First, I list them here so we can discuss whether the proposed fixes are right.

      Part A: check meta info
      1. errors.reportError(ERROR_CODE.NULL_ROOT_REGION, "Root Region or some of its attributes are null.");
      ------> After deleting the root table, I ran hbck to check, but the tool itself failed with an error. How can this error be reproduced?

      2. errors.reportError(ERROR_CODE.NO_META_REGION, ".META. is not found on any region.");
      ------> After deleting the meta table, I ran hbck to check, but the tool itself failed with an error. How can this error be reproduced?

      3. errors.reportError(ERROR_CODE.MULTI_META_REGION, ".META. is found on more than one region.");
      ------> The logic: scan the root table to get the META table's region info; if the META table has more than one region, report the error.
      HBase allows the META table to have more than one region, doesn't it?
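
The check described in item 3 can be sketched as follows. This is a minimal illustration, not the HBaseFsck code: the class MetaRegionCheck, the method classify, and the OK result are hypothetical names, and the input list stands in for the .META. region rows found while scanning the root table.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class MetaRegionCheck {
    // NO_META_REGION and MULTI_META_REGION mirror the ERROR_CODE names quoted
    // in the issue; OK is added here only for illustration.
    enum Result { OK, NO_META_REGION, MULTI_META_REGION }

    /** Classifies the .META. region rows found while scanning the root table. */
    static Result classify(List<String> metaRegionNames) {
        if (metaRegionNames.isEmpty()) {
            return Result.NO_META_REGION;
        }
        if (metaRegionNames.size() > 1) {
            return Result.MULTI_META_REGION;
        }
        return Result.OK;
    }

    public static void main(String[] args) {
        // The region names below are made-up placeholders.
        System.out.println(classify(Collections.<String>emptyList()));
        System.out.println(classify(Arrays.asList("meta-region-1")));
        System.out.println(classify(Arrays.asList("meta-region-1", "meta-region-2")));
    }
}
```

If HBase does permit a multi-region META table, the second branch would need to be dropped or relaxed, which is the question raised in item 3.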

      Part B: check consistency
      4. ERROR_CODE.NOT_IN_META_HDFS ----> close it on the regionserver.

      5. ERROR_CODE.NOT_IN_META_OR_DEPLOYED ----> do nothing; it may be used to fix a chain hole in Part C.

      6. ERROR_CODE.NOT_IN_META ----> close it on the regionserver.

      7. ERROR_CODE.NOT_IN_HDFS_OR_DEPLOYED ----> delete it from the META table. This leaves a hole in the chain, which is fixed when chain integrity is checked (Part C).

      8. ERROR_CODE.NOT_IN_HDFS ----> delete it from the META table and close it on the regionserver, then fix it when chain integrity is checked (Part C).

      9. ERROR_CODE.NOT_DEPLOYED ----> assign it.

      10. ERROR_CODE.SHOULD_NOT_BE_DEPLOYED ----> delete it from the META table and close it on the regionserver.

      11. ERROR_CODE.MULTI_DEPLOYED ----> close it on all regionservers, then reassign it.

      12. ERROR_CODE.SERVER_DOES_NOT_MATCH_META ----> close it on all regionservers, then reassign it.
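
The Part B proposals amount to a lookup from error code to an ordered list of repair steps. The sketch below encodes that table as data, assuming nothing about the real HBaseFsck internals: the class RepairPlan, the Action enum, and actionsFor are hypothetical names, and only the error code names come from the list above.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

final class RepairPlan {
    // Error codes from items 4 to 12 of the proposal.
    enum ErrorCode {
        NOT_IN_META_HDFS, NOT_IN_META_OR_DEPLOYED, NOT_IN_META,
        NOT_IN_HDFS_OR_DEPLOYED, NOT_IN_HDFS, NOT_DEPLOYED,
        SHOULD_NOT_BE_DEPLOYED, MULTI_DEPLOYED, SERVER_DOES_NOT_MATCH_META
    }

    // Hypothetical names for the three repair steps the proposal mentions.
    enum Action { CLOSE_ON_REGIONSERVER, DELETE_FROM_META, ASSIGN }

    private static final Map<ErrorCode, List<Action>> PLAN = new EnumMap<>(ErrorCode.class);

    static {
        PLAN.put(ErrorCode.NOT_IN_META_HDFS, Arrays.asList(Action.CLOSE_ON_REGIONSERVER));
        // Item 5: nothing to do here; left for the chain check in Part C.
        PLAN.put(ErrorCode.NOT_IN_META_OR_DEPLOYED, Collections.<Action>emptyList());
        PLAN.put(ErrorCode.NOT_IN_META, Arrays.asList(Action.CLOSE_ON_REGIONSERVER));
        PLAN.put(ErrorCode.NOT_IN_HDFS_OR_DEPLOYED, Arrays.asList(Action.DELETE_FROM_META));
        PLAN.put(ErrorCode.NOT_IN_HDFS,
                Arrays.asList(Action.DELETE_FROM_META, Action.CLOSE_ON_REGIONSERVER));
        PLAN.put(ErrorCode.NOT_DEPLOYED, Arrays.asList(Action.ASSIGN));
        PLAN.put(ErrorCode.SHOULD_NOT_BE_DEPLOYED,
                Arrays.asList(Action.DELETE_FROM_META, Action.CLOSE_ON_REGIONSERVER));
        PLAN.put(ErrorCode.MULTI_DEPLOYED,
                Arrays.asList(Action.CLOSE_ON_REGIONSERVER, Action.ASSIGN));
        PLAN.put(ErrorCode.SERVER_DOES_NOT_MATCH_META,
                Arrays.asList(Action.CLOSE_ON_REGIONSERVER, Action.ASSIGN));
    }

    /** Returns the proposed repair steps, in order, for the given error code. */
    static List<Action> actionsFor(ErrorCode code) {
        return PLAN.get(code);
    }

    public static void main(String[] args) {
        for (ErrorCode code : ErrorCode.values()) {
            System.out.println(code + " -> " + actionsFor(code));
        }
    }
}
```

Keeping the mapping as data rather than as branches in the checker makes it easy to review each proposed fix against the list and to change one entry without touching the others.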

      Part C: check chain integrity
      13. ERROR_CODE.FIRST_REGION_STARTKEY_NOT_EMPTY ----> treat it as a hole problem (ERROR_CODE.HOLE_IN_REGION_CHAIN).

      14. ERROR_CODE.LAST_REGION_ENDKEY_NOT_EMPTY (newly added) ----> treat it as a hole problem (ERROR_CODE.HOLE_IN_REGION_CHAIN).

      15. ERROR_CODE.REGION_CYCLE ----> shut down the cluster and merge the two regions with the merge tool (org.apache.hadoop.hbase.util.Merge).

      16. ERROR_CODE.DUPE_STARTKEYS ----> shut down the cluster and merge the two regions with the merge tool (org.apache.hadoop.hbase.util.Merge).

      17. ERROR_CODE.OVERLAP_IN_REGION_CHAIN ----> shut down the cluster and merge the two regions with the merge tool (org.apache.hadoop.hbase.util.Merge).

      18. ERROR_CODE.HOLE_IN_REGION_CHAIN ----> write a new method to fix it. The logic: to recover the data, collect region info from the regionservers and HDFS. If a region's key range overlaps the hole's range, put it in the META table and assign it; this may create an overlap problem, which can be fixed with the merge tool. If no region is collected, create a new region covering the hole's key range.
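
The hole handling in items 13, 14 and 18 can be sketched as below. This is an illustration under stated assumptions, not the attached patch: keys are plain Java strings compared with String.compareTo (HBase itself uses byte arrays), an empty key marks the start or end of the table, and RegionChainHoles, Range, findHoles, overlaps and candidatesFor are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class RegionChainHoles {
    /** Key range [start, end); "" as start is table start, "" as end is table end. */
    static final class Range {
        final String start;
        final String end;
        Range(String start, String end) { this.start = start; this.end = end; }
        @Override public String toString() { return "[" + start + "," + end + ")"; }
    }

    /** Returns the key ranges not covered by any region in the chain. */
    static List<Range> findHoles(List<Range> regions) {
        List<Range> sorted = new ArrayList<>(regions);
        sorted.sort(Comparator.comparing((Range r) -> r.start));
        List<Range> holes = new ArrayList<>();
        String covered = "";        // every key below this one is covered
        boolean reachedEnd = false; // true once a region with an empty end key is seen
        for (Range r : sorted) {
            if (reachedEnd) break;
            if (r.start.compareTo(covered) > 0) {
                holes.add(new Range(covered, r.start)); // includes item 13's case
            }
            if (r.end.isEmpty()) {
                reachedEnd = true;
            } else if (r.end.compareTo(covered) > 0) {
                covered = r.end;
            }
        }
        if (!reachedEnd) {
            holes.add(new Range(covered, "")); // item 14's case
        }
        return holes;
    }

    /** True if the two half-open ranges share at least one key. */
    static boolean overlaps(Range a, Range b) {
        boolean aStartsBeforeBEnds = b.end.isEmpty() || a.start.compareTo(b.end) < 0;
        boolean bStartsBeforeAEnds = a.end.isEmpty() || b.start.compareTo(a.end) < 0;
        return aStartsBeforeBEnds && bStartsBeforeAEnds;
    }

    /**
     * Item 18: regions collected from regionservers and HDFS that overlap the
     * hole are put back; if none is found, a new region spanning the hole is proposed.
     */
    static List<Range> candidatesFor(Range hole, List<Range> collected) {
        List<Range> result = new ArrayList<>();
        for (Range r : collected) {
            if (overlaps(r, hole)) result.add(r);
        }
        if (result.isEmpty()) result.add(new Range(hole.start, hole.end));
        return result;
    }

    public static void main(String[] args) {
        List<Range> chain = new ArrayList<>();
        chain.add(new Range("", "b"));
        chain.add(new Range("d", "f"));
        for (Range hole : findHoles(chain)) {
            System.out.println("hole " + hole);
        }
    }
}
```

A recovered region that only partly overlaps a hole will also overlap a neighbouring region, which is the overlap problem item 18 expects the merge tool to resolve afterwards.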

      Attachments

        1. HbaseFsck_TableChain.patch
          1.0 kB
          feng xu

            People

              Assignee: Unassigned
              Reporter: feng xu
              Votes: 1
              Watchers: 3

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated: 12h
                  Remaining: 12h
                  Logged: Not Specified