diff --git a/src/main/asciidoc/_chapters/architecture.adoc b/src/main/asciidoc/_chapters/architecture.adoc
index d5117db7b7..09f79c4a05 100644
--- a/src/main/asciidoc/_chapters/architecture.adoc
+++ b/src/main/asciidoc/_chapters/architecture.adoc
@@ -594,6 +594,67 @@
 See <> for more information on region assignment.
 
 Periodically checks and cleans up the `hbase:meta` table. See <> for more information on the meta table.
 
+[[master.wal]]
+=== MasterProcWAL
+
+HMaster records administrative operations and their running states (a.k.a. Procedure V2), such as the handling of a crashed server, table creation, and other DDLs, into its own WAL file. The WALs are stored under the MasterProcWALs directory, which is separate from the RegionServer WALs. This allows operations to be carried out atomically: for example, if an HMaster fails in the middle of creating a table, the next active HMaster can resume and complete the operation. Since the HBase 2.0 release, a new AssignmentManager (AM V2) has been introduced, and the HMaster performs region assignment via AM V2, persisting its state in the MasterProcWALs instead of in ZooKeeper. The design documentation of AM V2 and Procedure V2 can be found in link:https://issues.apache.org/jira/browse/HBASE-14350[HBASE-14350] and link:https://issues.apache.org/jira/browse/HBASE-12439[HBASE-12439], respectively.
+
+[[master.wal.conf]]
+==== Configurations for MasterProcWAL
+
+[[hbase.procedure.store.wal.periodic.roll.msec]]
+*`hbase.procedure.store.wal.periodic.roll.msec`*::
++
+.Description
+Frequency of generating a new WAL.
++
+.Default
+`1h (3600000 in msec)`
+
+[[hbase.procedure.store.wal.roll.threshold]]
+*`hbase.procedure.store.wal.roll.threshold`*::
++
+.Description
+Size threshold at which the WAL rolls. Whenever the WAL reaches this size, or the above period (1 hour by default) has passed since the last log roll, the HMaster generates a new WAL.
++
+.Default
+`32MB (33554432 in bytes)`
+
+[[hbase.procedure.store.wal.warn.threshold]]
+*`hbase.procedure.store.wal.warn.threshold`*::
++
+.Description
+If the number of WALs goes beyond this threshold, the following message appears in the HMaster log at WARN level when rolling:
+
+ procedure WALs count=xx above the warning threshold 64. check running procedures to see if something is stuck.
+
++
+.Default
+`64`
+
+[[hbase.procedure.store.wal.max.retries.before.roll]]
+*`hbase.procedure.store.wal.max.retries.before.roll`*::
++
+.Description
+Max number of retries when syncing slots (records) to the underlying storage, such as HDFS. On each failed attempt, the following message appears in the HMaster log:
+
+ unable to sync slots, retry=xx
+
++
+.Default
+`3`
+
+[[hbase.procedure.store.wal.sync.failure.roll.max]]
+*`hbase.procedure.store.wal.sync.failure.roll.max`*::
++
+.Description
+After the above three retries, the log is rolled, the retry count is reset to 0, and a new round of retries begins. This property controls the max number of log-roll attempts upon sync failure; with the defaults, the HMaster is allowed to fail to sync at most 3 * 3 = 9 times in total. Once this limit is exceeded, the following message appears in the HMaster log:
+
+ Sync slots after log roll failed, abort.
++
+.Default
+`3`
+
 [[regionserver.arch]]
 == RegionServer
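As a usage sketch (not part of the patch itself): the properties documented above would be set in `hbase-site.xml` like any other HBase configuration. The values below are illustrative examples, not recommendations — the defaults (1 hour, 32MB, 64) are reasonable for most deployments.

```xml
<!-- Illustrative hbase-site.xml fragment tuning MasterProcWAL rolling.
     Values are examples only; defaults are 3600000 msec, 33554432 bytes, 64. -->
<configuration>
  <!-- Roll the procedure WAL every 30 minutes instead of the default 1 hour -->
  <property>
    <name>hbase.procedure.store.wal.periodic.roll.msec</name>
    <value>1800000</value>
  </property>
  <!-- Roll earlier, at 16MB, to keep individual procedure WAL files smaller -->
  <property>
    <name>hbase.procedure.store.wal.roll.threshold</name>
    <value>16777216</value>
  </property>
  <!-- Warn once more than 32 procedure WALs accumulate -->
  <property>
    <name>hbase.procedure.store.wal.warn.threshold</name>
    <value>32</value>
  </property>
</configuration>
```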