  1. HBase
  2. HBASE-13576

HBCK enhancement: Failure in checking one region should not fail the entire HBCK operation.


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.1.0, 1.2.0, 2.0.0
    • Fix Version/s: 1.2.0, 1.1.1, 2.0.0
    • Component/s: hbck
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      HBaseFsck#checkRegionConsistency() checks region consistency and repairs the corruption if requested. However, this function is expected to throw exceptions in some cases. For example, as one part of region repair it calls HBaseFsckRepair#waitUntilAssigned(); if a region stays in transition for more than 120 seconds (the default value of the "hbase.hbck.assign.timeout" configuration), an IOException is thrown.
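      To make the timeout behavior concrete, here is a minimal sketch of a poll-until-assigned loop that gives up with an IOException. It is not the actual HBaseFsckRepair code: the class name AssignTimeoutSketch, the method waitUntil, and its parameters are hypothetical, and the "assigned" check is a stand-in supplied by the caller.

```java
import java.io.IOException;
import java.util.function.BooleanSupplier;

/** Hypothetical illustration only; not the real HBaseFsckRepair implementation. */
class AssignTimeoutSketch {

  /**
   * Polls the supplied condition until it returns true.
   * Throws IOException if the condition is still false once timeoutMs has elapsed.
   */
  static void waitUntil(BooleanSupplier assigned, long timeoutMs, long pollMs)
      throws IOException, InterruptedException {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (!assigned.getAsBoolean()) {
      if (System.currentTimeMillis() >= deadline) {
        // This is the kind of exception that reaches the caller of the consistency check.
        throw new IOException("Region still in transition after " + timeoutMs + " ms");
      }
      Thread.sleep(pollMs);
    }
  }

  public static void main(String[] args) throws InterruptedException {
    try {
      // A region that never leaves transition: the wait times out.
      waitUntil(() -> false, 50, 10);
    } catch (IOException e) {
      System.out.println("Timed out: " + e.getMessage());
    }
  }
}
```

      The point of the sketch is that a slow region surfaces as an ordinary IOException, so the caller decides whether that failure is fatal.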

      The problem is that a single exception in checkRegionConsistency() kills the entire hbck operation, because the exception propagates up to the caller.

      The proposal is that if the region is not the META region (or a system table region, if we prefer), we skip the region when HBaseFsck#checkRegionConsistency() fails. We could print the skipped regions in the summary section so that users know to either re-run hbck or investigate the potential issue for those regions.
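      The proposal can be sketched roughly as follows. This is an illustration under stated assumptions, not the attached patch: SkipOnFailureSketch, RegionChecker, and checkAll are hypothetical names, regions are represented as plain strings, and only the META region is treated as fatal.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration only; names do not come from the HBaseFsck source. */
class SkipOnFailureSketch {

  /** Stand-in for the per-region consistency check, which may throw IOException. */
  interface RegionChecker {
    void check(String regionName) throws IOException;
  }

  /**
   * Checks every region in order. A failure on the meta region is rethrown and
   * aborts the run; a failure on any other region is recorded and the loop continues.
   * Returns the names of the regions that were skipped.
   */
  static List<String> checkAll(List<String> regions, String metaRegion, RegionChecker checker)
      throws IOException {
    List<String> skipped = new ArrayList<>();
    for (String region : regions) {
      try {
        checker.check(region);
      } catch (IOException e) {
        if (region.equals(metaRegion)) {
          throw e; // meta must be consistent, so this failure stays fatal
        }
        skipped.add(region); // remember the region for the summary section
      }
    }
    return skipped;
  }

  public static void main(String[] args) throws IOException {
    List<String> regions = Arrays.asList("hbase:meta", "t1,region-a", "t1,region-b");
    List<String> skipped = checkAll(regions, "hbase:meta", region -> {
      if (region.endsWith("region-a")) {
        throw new IOException("Region still in transition: " + region);
      }
    });
    // Summary: tell the user which regions to re-run or investigate.
    System.out.println("Skipped regions: " + skipped);
  }
}
```

      Returning the skipped list instead of logging inside the loop keeps the summary in one place, which matches the idea of reporting skipped regions at the end of the run.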

      Attachments

        1. HBASE-13576.v3-master.patch
          9 kB
          Stephen Yuan Jiang
        2. HBASE-13576.v2-master.patch
          9 kB
          Stephen Yuan Jiang
        3. HBASE-13576.v1-master.patch
          8 kB
          Stephen Yuan Jiang


          People

            Assignee: syuanjiang Stephen Yuan Jiang
            Reporter: syuanjiang Stephen Yuan Jiang
            Votes: 0
            Watchers: 5
