Index: src/docbkx/book.xml =================================================================== --- src/docbkx/book.xml (revision 1342084) +++ src/docbkx/book.xml (working copy) @@ -2711,6 +2711,195 @@ + + hbck In Depth + HBaseFsck (hbck) is a tool for checking for region consistency and table integrity problems +and repairing a corrupted HBase. It works in two basic modes -- a read-only inconsistency +identifying mode and a multi-phase read-write repair mode. + +
+ Running hbck to identify inconsistencies +To check whether your HBase cluster has corruptions, run hbck against it: + +$ ./bin/hbase hbck + + +At the end of the command's output it prints OK or tells you the number of INCONSISTENCIES +present. You may also want to run hbck a few times because some inconsistencies can be +transient (e.g. the cluster is starting up or a region is splitting). Operationally you may want to run +hbck regularly and set up an alert (e.g. via nagios) if it repeatedly reports inconsistencies. +A run of hbck will report a list of inconsistencies along with a brief description of the regions and +tables affected. Using the -details option will report more details, including a representative +listing of all the splits present in all the tables. + + +$ ./bin/hbase hbck -details + +
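The advice to alert only on repeated inconsistencies can be sketched as a small wrapper around the read-only command. This is a minimal sketch, not part of hbck: the function names, the three-run threshold, and the regular expression are assumptions for illustration, and the exact wording of the hbck summary may differ between HBase versions.

```python
import re
import subprocess

# Assumption: the summary mentions a count followed by the word
# "inconsistencies"; adjust the pattern to the output of your HBase version.
COUNT_PATTERN = re.compile(r"(\d+)\s+inconsistenc", re.IGNORECASE)


def count_inconsistencies(hbck_output):
    """Return the largest inconsistency count mentioned in the output, or 0."""
    counts = [int(m.group(1)) for m in COUNT_PATTERN.finditer(hbck_output)]
    return max(counts, default=0)


def should_alert(recent_counts, threshold=3):
    """Alert only when the last `threshold` runs all reported problems.

    Requiring several consecutive bad runs filters out transient
    inconsistencies (cluster starting up, region splitting).
    """
    if len(recent_counts) < threshold:
        return False
    return all(count > 0 for count in recent_counts[-threshold:])


def run_hbck(hbase_cmd="./bin/hbase"):
    """Run hbck in its read-only mode and return the combined output."""
    result = subprocess.run(
        [hbase_cmd, "hbck"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    return result.stdout
```

A scheduler (cron, or a nagios check) could call run_hbck, append count_inconsistencies(...) to a short history, and raise the alert when should_alert returns True.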
+
Inconsistencies + + If, after several runs, inconsistencies continue to be reported, you may have encountered a +corruption. These should be rare, but in the event they occur, newer versions of HBase include +the hbck tool enabled with automatic repair options. + + + There are two invariants that, when violated, create inconsistencies in HBase: + + + HBase’s region consistency invariant is satisfied if every region is assigned and +deployed on exactly one region server, and all places where this state is kept are in +accordance. + + HBase’s table integrity invariant is satisfied if for each table, every possible row key +resolves to exactly one region. + + + +Repairs generally work in three phases -- a read-only information gathering phase that identifies +inconsistencies, a table integrity repair phase that restores the table integrity invariant, and then +finally a region consistency repair phase that restores the region consistency invariant. +Starting from version 0.90.0, hbck could detect region consistency problems and report on a subset +of possible table integrity problems. It also included the ability to automatically fix the most +common inconsistency, region assignment and deployment consistency problems. This repair +could be done by using the -fix command line option. It closes regions if they are +open on the wrong server or on multiple region servers and also assigns regions to region +servers if they are not open. + + +Starting from HBase versions 0.90.7, 0.92.2 and 0.94.0, several new command line options are +introduced to aid repairing a corrupted HBase. This hbck sometimes goes by the nickname +“uberhbck”. Each particular version of uberhbck is compatible with HBase versions of the same +major version (0.90.7 uberhbck can repair a 0.90.4). However, versions <=0.90.6 and versions +<=0.92.1 may require restarting the master or failing over to a backup master. + +
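The table integrity invariant in this section (every possible row key resolves to exactly one region) can be illustrated with a small checker over region key ranges. This is a sketch for intuition only, not hbck's implementation: the function name and the tuple-based problem format are hypothetical. It relies on the HBase convention that an empty start key means the beginning of the table and an empty end key means the end of the table, and it assumes every region has a start key that sorts before its end key.

```python
def check_table_integrity(regions):
    """Report holes and overlaps in a table's region ranges.

    regions: list of (start_key, end_key) byte strings, each a half-open
    range [start_key, end_key). An empty end key means "end of table".
    Returns a list of ("hole", from_key, to_key) and ("overlap", start_key)
    tuples; an empty list means the invariant holds.
    """
    problems = []
    covered_to = b""   # keys below this are already covered by some region
    unbounded = False  # True once a region reaching the end of table is seen
    for start, end in sorted(regions):
        if unbounded or start < covered_to:
            # Some row key would resolve to more than one region.
            problems.append(("overlap", start))
        elif start > covered_to:
            # Row keys in [covered_to, start) resolve to no region.
            problems.append(("hole", covered_to, start))
        if end == b"":
            unbounded = True
        elif end > covered_to:
            covered_to = end
    if not unbounded:
        # Nothing covers the tail of the key space.
        problems.append(("hole", covered_to, b""))
    return problems
```

For example, regions [(b"", b"b"), (b"c", b"")] leave the keys between b"b" and b"c" without a region, which the checker reports as a single hole.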
+
Localized repairs + + When repairing a corrupted HBase, it is best to repair the lowest risk inconsistencies first. +These are generally region consistency repairs -- localized single-region repairs that only modify +in-memory data, ephemeral zookeeper data, or patch holes in the META table. +Region consistency requires that the HBase instance has the state of the region’s data in HDFS +(.regioninfo files), the region’s row in the .META. table, and the region’s deployment/assignments on +region servers and the master in accordance. Options for repairing region consistency include: + + -fixAssignments (equivalent to the 0.90 -fix option) repairs unassigned, incorrectly +assigned or multiply assigned regions. + + -fixMeta removes meta rows when corresponding regions are not present in +HDFS and adds new meta rows if the regions are present in HDFS but not in META. + + + To fix deployment and assignment problems you can run this command: + + +$ ./bin/hbase hbck -fixAssignments + +To fix deployment and assignment problems as well as repair incorrect meta rows you can +run this command: + +$ ./bin/hbase hbck -fixAssignments -fixMeta + +There are a few classes of table integrity problems that are low-risk repairs. The first two are +degenerate (startkey == endkey) regions and backwards regions (startkey > endkey). These are +automatically handled by sidelining the data to a temporary directory (/hbck/xxxx). +The third low-risk class is HDFS region holes. This can be repaired by using the: + + -fixHdfsHoles option for fabricating new empty regions on the file system. +If holes are detected you can use -fixHdfsHoles and should include -fixMeta and -fixAssignments to make the new region consistent. 
+ + + +$ ./bin/hbase hbck -fixAssignments -fixMeta -fixHdfsHoles + +Since this is a common operation, we’ve added the -repairHoles flag that is equivalent to the +previous command: + +$ ./bin/hbase hbck -repairHoles + +If inconsistencies still remain after these steps, you most likely have table integrity problems +related to orphaned or overlapping regions. +
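The lowest-risk-first ordering of the localized repairs can be written down as data, which may help when scripting an escalation. This is a minimal sketch under stated assumptions: the flags are the ones named in this section, while the helper names and the idea of stepping through the list are illustrative and not something hbck provides.

```python
# Flag sets taken from this section, ordered from lowest to highest risk.
LOCALIZED_REPAIR_STEPS = [
    ["-fixAssignments"],
    ["-fixAssignments", "-fixMeta"],
    ["-fixAssignments", "-fixMeta", "-fixHdfsHoles"],
]

# -repairHoles is described as equivalent to the last step.
SHORTHAND = {"-repairHoles": LOCALIZED_REPAIR_STEPS[-1]}


def expand_flags(flags):
    """Replace shorthand flags with the individual options they stand for."""
    expanded = []
    for flag in flags:
        for option in SHORTHAND.get(flag, [flag]):
            if option not in expanded:
                expanded.append(option)
    return expanded


def hbck_command(flags, hbase_cmd="./bin/hbase"):
    """Build the argument list for one hbck invocation (does not run it)."""
    return [hbase_cmd, "hbck"] + list(flags)
```

An operator script might run each step in turn, re-checking with a read-only hbck run in between, and stop as soon as no inconsistencies remain.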
+
Region Overlap Repairs +Table integrity problems can require repairs that deal with overlaps. This is a riskier operation +because it requires modifications to the file system, requires some decision making, and may +require some manual steps. For these repairs it is best to analyze the output of an hbck -details +run so that you limit repair attempts to only the problems the checks identify. Because this is +riskier, there are safeguards that should be used to limit the scope of the repairs. +WARNING: This feature is relatively new and has only been tested on online but idle HBase instances +(no reads/writes). Use at your own risk in an active production environment! +The options for repairing table integrity violations include: + + -fixHdfsOrphans option for “adopting” a region directory that is missing a region +metadata file (the .regioninfo file). + + -fixHdfsOverlaps option for fixing overlapping regions. + + +When repairing overlapping regions, a region’s data can be modified on the file system in two +ways: 1) by merging regions into a larger region or 2) by sidelining regions by moving data to +a “sideline” directory where data could be restored later. Merging a large number of regions is +technically correct but could result in an extremely large region that requires a series of costly +compactions and splitting operations. In these cases, it is probably better to sideline the regions +that overlap with the most other regions (likely the largest ranges) so that merges can happen on +a more reasonable scale. Since these sidelined regions are already laid out in HBase’s native +directory and HFile format, they can be restored by using HBase’s bulk load mechanism. +The default safeguard thresholds are conservative. These options let you override the default +thresholds and enable the large region sidelining feature. 
+ + -maxMerge <n> maximum number of overlapping regions to merge + + -sidelineBigOverlaps if more than maxMerge regions are overlapping, attempt +to sideline the regions overlapping with the most other regions. + + -maxOverlapsToSideline <n> if sidelining large overlapping regions, sideline at most n +regions. + + + +Since you will often just want to get the tables repaired, you can use this option to turn +on the common repair options: + + -repair includes all the region consistency options and only the hole repairing table +integrity options. + + +Finally, there are safeguards to limit repairs to only specific tables. For example, the following +command would only attempt to repair tables TableFoo and TableBar. + +$ ./bin/hbase hbck -repair TableFoo TableBar + +
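The interplay of -maxMerge, -sidelineBigOverlaps and -maxOverlapsToSideline described in this section can be sketched as a selection rule over one group of overlapping regions. This is an illustration of the idea only and is not hbck's actual algorithm: the function names are hypothetical, regions are modeled as (start_key, end_key) byte-string pairs, and an empty end key is taken to mean the end of the table.

```python
def overlaps(a, b):
    """True if half-open key ranges a and b share at least one row key."""
    a_start, a_end = a
    b_start, b_end = b
    a_before_b = a_end != b"" and a_end <= b_start
    b_before_a = b_end != b"" and b_end <= a_start
    return not (a_before_b or b_before_a)


def pick_regions_to_sideline(group, max_merge, max_overlaps_to_sideline):
    """Choose regions to sideline from one group of overlapping regions.

    If the group is small enough to merge (at most max_merge regions),
    nothing is sidelined. Otherwise the regions that overlap with the most
    other regions are picked, up to max_overlaps_to_sideline of them.
    """
    if len(group) <= max_merge:
        return []
    counts = []
    for i, region in enumerate(group):
        n = sum(
            1
            for j, other in enumerate(group)
            if i != j and overlaps(region, other)
        )
        counts.append((n, i))
    # Most overlaps first; ties keep the original order.
    counts.sort(key=lambda pair: (-pair[0], pair[1]))
    return [group[i] for _, i in counts[:max_overlaps_to_sideline]]
```

With one wide region spanning several narrow, adjacent ones, the wide region has the highest overlap count, so sidelining it leaves the narrow regions with nothing left to merge.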
Special cases: Meta is not properly assigned +There are a few special cases that hbck can handle as well. +Sometimes the meta table’s only region is inconsistently assigned or deployed. In this case +there is a special -fixMetaOnly option that can try to fix meta assignments. + +$ ./bin/hbase hbck -fixMetaOnly -fixAssignments + +
+
Special cases: HBase version file is missing +HBase’s data on the file system requires a version file in order to start. If this file is missing, you +can use the -fixVersionFile option to fabricate a new HBase version file. This assumes that +the version of hbck you are running is the appropriate version for the HBase cluster. +
+
Special case: Root and META are corrupt. +The most drastic corruption scenario is the case where the ROOT or META is corrupted and +HBase will not start. In this case you can use the OfflineMetaRepair tool to create new ROOT +and META regions and tables. +This tool assumes that HBase is offline. It then marches through the existing HBase home +directory and loads as much information from region metadata files (.regioninfo files) as possible +from the file system. If the region metadata has proper table integrity, it sidelines the original root +and meta table directories, and builds new ones with pointers to the region directories and their +data. + +$ ./bin/hbase org.apache.hadoop.hbase.util.OfflineMetaRepair + +NOTE: This tool is not as clever as uberhbck but can be used to bootstrap repairs that uberhbck +can complete. +If the tool succeeds you should be able to start HBase and run online repairs if necessary. +
+
+
+ Compression In HBase<indexterm><primary>Compression</primary></indexterm> Index: src/docbkx/ops_mgt.xml =================================================================== --- src/docbkx/ops_mgt.xml (revision 1342083) +++ src/docbkx/ops_mgt.xml (working copy) @@ -69,6 +69,8 @@ Passing -fix may correct the inconsistency (This latter is an experimental feature). + For more information, see . +
HFile Tool See .