HBase / HBASE-21745

Make HBCK2 be able to fix issues other than region assignment

Details

    • Reviewed
      This issue adds via its subtasks:

       * An 'HBCK Report' page in the Master UI, added by HBASE-22527+HBASE-22709+HBASE-22723+ (since 2.1.6, 2.2.1, 2.3.0). It lists inconsistencies and anomalies found by new hbase:meta consistency-checking extensions added to CatalogJanitor (holes, overlaps, bad servers) and by a new 'HBCK chore' that runs less frequently and notes filesystem orphans and overlaps as well as the following conditions:
       ** Master thought this region opened, but no regionserver reported it.
       ** Master thought this region opened on Server1, but a regionserver reported Server2.
       ** More than one regionserver reported this region as opened.
       Both chores can be triggered from the shell to regenerate ‘new’ reports.
       * Means of scheduling a ServerCrashProcedure (HBASE-21393).
       * An ‘offline’ hbase:meta rebuild (HBASE-22680).
       * Offline replacement of hbase.version and hbase.id.
       * Documentation on how to use the completebulkload tool to ‘adopt’ orphaned data found by the new HBCK2 ‘filesystem’ check (see below) and the ‘HBCK chore’ (HBASE-22859).
       * A ‘holes’ and ‘overlaps’ fix that runs in the master and uses a new bulk-merge facility to collapse many overlaps in one go.
       * The hbase-operator-tools HBCK2 client tool got a number of additions:
       ** A specialized 'fix' for the case where operators ran the old hbck 'offlinemeta' repair and destroyed their hbase:meta; it ties together holes in meta with orphaned data in the fs (HBASE-22567).
       ** A ‘filesystem’ command that reports on orphan data as well as bad references and hlinks, with a ‘fix’ for the latter two (based on an updated hbck1 facility).
       ** Adds back the ‘replication’ fix facility from hbck1 (HBASE-22717)
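
      The three 'HBCK chore' conditions listed above amount to comparing the Master's view of region assignments with what the regionservers report. The following is a minimal Python sketch of that comparison, not the actual HBase implementation; the function name, the dictionary inputs, and the message strings are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def find_assignment_inconsistencies(master_view, server_reports):
    """Compare the master's assignments against regionserver reports.

    master_view:    dict mapping region name -> server the master believes hosts it
    server_reports: dict mapping server name -> set of regions it reports as open
    Both structures are hypothetical stand-ins for illustration.
    """
    # Invert the reports: which servers claim each region?
    reported_by = defaultdict(set)
    for server, regions in server_reports.items():
        for region in regions:
            reported_by[region].add(server)

    problems = {}
    for region, expected in master_view.items():
        servers = reported_by.get(region, set())
        if not servers:
            # Condition 1: master thinks it is open, nobody reports it.
            problems[region] = "no regionserver reported it"
        elif len(servers) > 1:
            # Condition 3: more than one regionserver reports it.
            problems[region] = "reported by multiple regionservers: " + ", ".join(sorted(servers))
        elif expected not in servers:
            # Condition 2: reported by a different server than expected.
            (actual,) = servers
            problems[region] = f"master expected {expected}, reported by {actual}"
    return problems
```

      Regions that the master and exactly one matching regionserver agree on are left out of the result, so an empty dictionary means no inconsistency of these three kinds was found.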

      The compound result is that hbck2 now exceeds hbck1's abilities. The provided functionality is disaggregated, as per the hbck2 philosophy of providing 'plumbing' rather than 'porcelain', so there is still work to do adding fix-it playbooks, scripting across outages, and automation.
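
      The 'holes' and 'overlaps' mentioned above refer to gaps and collisions between the key ranges of adjacent regions. As a rough illustration of the idea only, and not of how CatalogJanitor actually scans hbase:meta, the sketch below walks a list of half-open `[start, end)` key ranges. It assumes string keys, with an empty start key meaning the beginning of the table and an empty end key meaning the end of the table; the function name and return shape are hypothetical.

```python
def find_holes_and_overlaps(regions):
    """Return (holes, overlaps) for a list of (start_key, end_key) tuples.

    Each region covers the half-open range [start_key, end_key).
    An empty end_key means 'to the end of the table'.
    """
    holes, overlaps = [], []
    if not regions:
        return holes, overlaps

    ordered = sorted(regions, key=lambda r: r[0])
    covered_to = ""           # key up to which the table is covered so far
    reached_table_end = False

    for start, end in ordered:
        if reached_table_end or start < covered_to:
            # Region begins inside a range that is already covered.
            overlaps.append((start, end))
        elif start > covered_to:
            # Nothing covers the keys between covered_to and start.
            holes.append((covered_to, start))

        if end == "":
            reached_table_end = True
        elif not reached_table_end and end > covered_to:
            covered_to = end

    if not reached_table_end:
        # The last region stops short of the end of the table.
        holes.append((covered_to, ""))
    return holes, overlaps
```

      Each hole is returned as the uncovered `(start, end)` range and each overlap as the region that begins inside already-covered key space; a fix such as the bulk merge described above would then act on those regions.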

    Description

      This is what apurtell posted on the mailing list; HBCK2 should support

        Sub-Tasks

          1. [hbck2] Add a master web ui to show the problematic regions (Sub-task, Resolved, Guanghao Zhang)
          2. Fix failed split and merge transactions that have failed to roll back (Sub-task, Resolved, Jingyun Tian)
          3. Fix region holes, overlaps, and other region related errors (Sub-task, Resolved, Jingyun Tian)
          4. Add an API ScheduleSCP() to HBCK2 (Sub-task, Resolved, Jingyun Tian)
          5. [HBCK2] OfflineMetaRepair for hbase2/hbck2 (Sub-task, Resolved, Michael Stack)
          6. Add to migration doc that meta should be healthy before upgrade (Sub-task, Resolved, Michael Stack)
          7. [HBCK2] Add filesystem fixup to hbck2 (Sub-task, Resolved, Michael Stack)
          8. HBCK - Add offline create/fix hbase.version and hbase.id (Sub-task, Closed, xufeng)
          9. Avoid to expose protobuf stuff in Hbck interface (Sub-task, Resolved, Guanghao Zhang)
          10. Add a chore thread in master to do hbck checking and display results in 'HBCK Report' page (Sub-task, Resolved, Guanghao Zhang)
          11. [HBCK2] Add hdfs integrity report to 'filesystem' command (Sub-task, Resolved, Michael Stack)
          12. [HBCK2] Expose replication fixes from hbck1 (Sub-task, Resolved, Michael Stack)
          13. Have CatalogJanitor report holes and overlaps; i.e. problems it sees when doing its regular scan of hbase:meta (Sub-task, Resolved, Michael Stack)
          14. Add a new admin method and shell cmd to trigger the hbck chore to run (Sub-task, Resolved, Guanghao Zhang)
          15. Show catalogjanitor consistency complaints in new 'HBCK Report' page (Sub-task, Resolved, Michael Stack)
          16. [HBCK2] Add more log for hbck operations at master side (Sub-task, Resolved, Guanghao Zhang)
          17. [HBCK2] fixMeta method and server-side support (Sub-task, Resolved, Michael Stack)
          18. Add a multi-region merge (for fixing overlaps, etc.) (Sub-task, Resolved, Michael Stack)
          19. [HBCK2] Add fix of overlaps to fixMeta hbck Service (Sub-task, Resolved, Sakthi)
          20. Modify config value range to enable turning off of the hbck chore (Sub-task, Resolved, Sakthi)
          21. Fix broken unit test, TestCatalogJanitorCluster on branch-2.1 and branch-2.0 (Sub-task, Resolved, Michael Stack)
          22. HBCK Report showed wrong orphans regions on FileSystem (Sub-task, Resolved, Guanghao Zhang)
          23. HBCK Report showed the offline regions which belong to disabled table (Sub-task, Resolved, Guanghao Zhang)
          24. Show filesystem path for the orphans regions on filesystem (Sub-task, Resolved, Guanghao Zhang)
          25. [HBCK2] Add a client-side to hbase-operator-tools that can exploit fixMeta added in server side (Sub-task, Resolved, Sakthi)
          26. [HBCK2] Fix HBCK2 after HBASE-22777 & HBASE-22758 (Sub-task, Resolved, Sakthi)
          27. Revert MetaTableAccessor#makePutFromTableState access to public (Sub-task, Resolved, Sakthi)
          28. Add HBCK Report to master's header.jsp (Sub-task, Resolved, Guanghao Zhang)
          29. [HBCK2] Fix the orphan regions on filesystem (Sub-task, Resolved, Michael Stack)
          30. [HBCK2] shows the whole help/usage message after the error message (Sub-task, Closed, Sakthi)
          31. [HBCK2] reference file check fails if compiled with old version but check against new (Sub-task, Resolved, Michael Stack)
          32. Move to SLF4J (Sub-task, Resolved, Peter Somogyi)
          33. Fix NOTICE and LICENSE (Sub-task, Resolved, Peter Somogyi)
          34. Should not show split parent regions in hbck report UI (Sub-task, Resolved, Guanghao Zhang)
          35. HBCK report UI showed -1 if hbck chore not running (Sub-task, Resolved, Guanghao Zhang)

            People

              stack Michael Stack
              zhangduo Duo Zhang