HBase / HBASE-21745

Make HBCK2 be able to fix issues other than region assignment


Details

    • Reviewed
      This issue adds, via its subtasks:

       * An 'HBCK Report' page in the Master UI, added by HBASE-22527, HBASE-22709, HBASE-22723, and more (since 2.1.6, 2.2.1, 2.3.0). It lists inconsistencies or anomalies found via new hbase:meta consistency checks added to CatalogJanitor (holes, overlaps, bad servers) and by a new 'HBCK chore' that runs less frequently and notes filesystem orphans and overlaps as well as the following conditions:
       ** Master thought this region opened, but no regionserver reported it.
       ** Master thought this region opened on Server1, but regionserver reported Server2
       ** More than one regionservers reported opened this region
       Both chores can be triggered from the shell to regenerate ‘new’ reports.
       * Means of scheduling a ServerCrashProcedure (HBASE-21393).
       * An ‘offline’ hbase:meta rebuild (HBASE-22680).
       * Offline replace of hbase.version and hbase.id
       * Documentation on how to use the completebulkload tool to ‘adopt’ orphaned data found by the new HBCK2 ‘filesystem’ check (see below) and the ‘HBCK chore’ (HBASE-22859)
       * A ‘holes’ and ‘overlaps’ fix that runs in the master and uses a new bulk-merge facility to collapse many overlaps in one go.
       * hbase-operator-tools HBCK2 client tool got a bunch of additions:
       ** A specialized 'fix' for the case where operators ran the old hbck 'offlinemeta' repair and destroyed their hbase:meta; it ties together holes in meta with orphaned data in the fs (HBASE-22567)
       ** A ‘filesystem’ command that reports on orphan data as well as bad references and hlinks, with a ‘fix’ option for the latter two (based on an updated hbck1 facility).
       ** The ‘replication’ fix facility from hbck1, added back (HBASE-22717)
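The 'holes' and 'overlaps' that the report lists can be pictured as gaps and intersections between the key ranges of a table's regions. The following is a minimal sketch of that idea only, not the CatalogJanitor code: the function name, the tuple representation of a region, and the use of an empty byte string for "start/end of table" are assumptions made for illustration.

```python
from typing import List, Tuple

# Assumption for this sketch: a region is (start_key, end_key), and an empty
# key means "start of table" as a start key or "end of table" as an end key.
Region = Tuple[bytes, bytes]


def _end_rank(end: bytes) -> Tuple[int, bytes]:
    # An empty end key reaches the end of the table, so it ranks above any real key.
    return (1, b"") if end == b"" else (0, end)


def find_holes_and_overlaps(regions: List[Region]):
    """Return (holes, overlaps) for the regions of a single table."""
    holes: List[Tuple[bytes, bytes]] = []
    overlaps: List[Tuple[Region, Region]] = []
    ordered = sorted(regions, key=lambda r: (r[0], _end_rank(r[1])))
    if not ordered:
        return holes, overlaps
    if ordered[0][0] != b"":
        holes.append((b"", ordered[0][0]))  # nothing covers the start of the table
    furthest = ordered[0]  # region whose end key reaches furthest so far
    for region in ordered[1:]:
        start, end = region
        prev_end = furthest[1]
        if prev_end == b"" or start < prev_end:
            overlaps.append((furthest, region))  # starts before the previous range ends
        elif start > prev_end:
            holes.append((prev_end, start))  # keys in between belong to no region
        if _end_rank(end) > _end_rank(prev_end):
            furthest = region
    if furthest[1] != b"":
        holes.append((furthest[1], b""))  # nothing covers the end of the table
    return holes, overlaps
```

For example, regions `[b"", b"b")`, `[b"b", b"d")`, `[b"c", b"e")` and `[b"g", end)` yield one overlap (the second and third regions) and one hole (`b"e"` to `b"g"`). An overlap pair like this is the kind of input a merge-based fix would collapse.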

      The compound result is that hbck2 now exceeds hbck1's abilities. The functionality is deliberately disaggregated, following the hbck2 philosophy of providing 'plumbing' rather than 'porcelain', so there is still work to do adding fix-it playbooks, scripting across outages, and automation.
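Since the tool offers separate 'plumbing' commands, a playbook is largely a script that strings invocations together. The sketch below only builds and prints command lines; it does not run them. It assumes the `hbase hbck -j <jar>` launch form and the command names `filesystem`, `replication`, `fixMeta` and `scheduleRecoveries` with a `--fix` option, as I understand the HBCK2 usage text; verify them against the usage output of the version you deploy. The jar path, table name, and server name are hypothetical.

```python
import shlex
from typing import List

# Hypothetical location of the hbase-operator-tools HBCK2 jar; adjust for your install.
HBCK2_JAR = "/path/to/hbase-hbck2.jar"


def hbck2_argv(command: str, *args: str, jar: str = HBCK2_JAR) -> List[str]:
    """Build the argument vector for one HBCK2 invocation (nothing is executed)."""
    return ["hbase", "hbck", "-j", jar, command, *args]


def report_then_fix(tables: List[str], fix: bool = False) -> List[List[str]]:
    """Report-only by default; pass fix=True to add the assumed --fix option."""
    flags = ["--fix"] if fix else []
    return [
        hbck2_argv("filesystem", *flags, *tables),
        hbck2_argv("replication", *flags, *tables),
    ]


if __name__ == "__main__":
    # Print the commands so an operator can review them before running anything.
    for argv in report_then_fix(["t1"]):
        print(shlex.join(argv))
    # Hypothetical server name in host,port,startcode form.
    print(shlex.join(hbck2_argv("scheduleRecoveries", "rs1.example.com,16020,1565000000000")))
    print(shlex.join(hbck2_argv("fixMeta")))
```

Keeping the report-only pass as the default mirrors the report-first, fix-second split described above: the fixing variant is opt-in.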

    Description

      This is what Andrew Kyle Purtell posted on the mailing list: HBCK2 should support

      Sub-Tasks

        1. [hbck2] Add a master web ui to show the problematic regions (Resolved; Guanghao Zhang)
        2. Fix failed split and merge transactions that have failed to roll back (Resolved; Jingyun Tian)
        3. Fix region holes, overlaps, and other region related errors (Resolved; Jingyun Tian)
        4. Add an API ScheduleSCP() to HBCK2 (Resolved; Jingyun Tian)
        5. [HBCK2] OfflineMetaRepair for hbase2/hbck2 (Resolved; Michael Stack)
        6. Add to migration doc that meta should be healthy before upgrade (Resolved; Michael Stack)
        7. [HBCK2] Add filesystem fixup to hbck2 (Resolved; Michael Stack)
        8. HBCK - Add offline create/fix hbase.version and hbase.id (Closed; xufeng)
        9. Avoid to expose protobuf stuff in Hbck interface (Resolved; Guanghao Zhang)
        10. Add a chore thread in master to do hbck checking and display results in 'HBCK Report' page (Resolved; Guanghao Zhang)
        11. [HBCK2] Add hdfs integrity report to 'filesystem' command (Resolved; Michael Stack)
        12. [HBCK2] Expose replication fixes from hbck1 (Resolved; Michael Stack)
        13. Have CatalogJanitor report holes and overlaps; i.e. problems it sees when doing its regular scan of hbase:meta (Resolved; Michael Stack)
        14. Add a new admin method and shell cmd to trigger the hbck chore to run (Resolved; Guanghao Zhang)
        15. Show catalogjanitor consistency complaints in new 'HBCK Report' page (Resolved; Michael Stack)
        16. [HBCK2] Add more log for hbck operations at master side (Resolved; Guanghao Zhang)
        17. [HBCK2] fixMeta method and server-side support (Resolved; Michael Stack)
        18. Add a multi-region merge (for fixing overlaps, etc.) (Resolved; Michael Stack)
        19. [HBCK2] Add fix of overlaps to fixMeta hbck Service (Resolved; Sakthi)
        20. Modify config value range to enable turning off of the hbck chore (Resolved; Sakthi)
        21. Fix broken unit test, TestCatalogJanitorCluster on branch-2.1 and branch-2.0 (Resolved; Michael Stack)
        22. HBCK Report showed wrong orphans regions on FileSystem (Resolved; Guanghao Zhang)
        23. HBCK Report showed the offline regions which belong to disabled table (Resolved; Guanghao Zhang)
        24. Show filesystem path for the orphans regions on filesystem (Resolved; Guanghao Zhang)
        25. [HBCK2] Add a client-side to hbase-operator-tools that can exploit fixMeta added in server side (Resolved; Sakthi)
        26. [HBCK2] Fix HBCK2 after HBASE-22777 & HBASE-22758 (Resolved; Sakthi)
        27. Revert MetaTableAccessor#makePutFromTableState access to public (Resolved; Sakthi)
        28. Add HBCK Report to master's header.jsp (Resolved; Guanghao Zhang)
        29. [HBCK2] Fix the orphan regions on filesystem (Resolved; Michael Stack)
        30. [HBCK2] shows the whole help/usage message after the error message (Closed; Sakthi)
        31. [HBCK2] reference file check fails if compiled with old version but check against new (Resolved; Michael Stack)
        32. Move to SLF4J (Resolved; Peter Somogyi)
        33. Fix NOTICE and LICENSE (Resolved; Peter Somogyi)
        34. Should not show split parent regions in hbck report UI (Resolved; Guanghao Zhang)
        35. HBCK report UI showed -1 if hbck chore not running (Resolved; Guanghao Zhang)


          People

            stack Michael Stack
            zhangduo Duo Zhang
            Votes: 0
            Watchers: 27
