diff --git hadoop-project/src/site/site.xml hadoop-project/src/site/site.xml
index e8c037b..451330e 100644
--- hadoop-project/src/site/site.xml
+++ hadoop-project/src/site/site.xml
@@ -99,6 +99,7 @@
+
diff --git hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/ResourceManagerHA.apt.vm hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/ResourceManagerHA.apt.vm
new file mode 100644
index 0000000..cb2260d
--- /dev/null
+++ hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/ResourceManagerHA.apt.vm
@@ -0,0 +1,225 @@
+~~ Licensed under the Apache License, Version 2.0 (the "License");
+~~ you may not use this file except in compliance with the License.
+~~ You may obtain a copy of the License at
+~~
+~~   http://www.apache.org/licenses/LICENSE-2.0
+~~
+~~ Unless required by applicable law or agreed to in writing, software
+~~ distributed under the License is distributed on an "AS IS" BASIS,
+~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+~~ See the License for the specific language governing permissions and
+~~ limitations under the License. See accompanying LICENSE file.
+
+  ---
+  ResourceManager High Availability
+  ---
+  ---
+  ${maven.build.timestamp}
+
+ResourceManager High Availability
+
+  \[ {{{./index.html}Go Back}} \]
+
+%{toc|section=1|fromDepth=0}
+
+* Introduction
+
+  This guide provides an overview of YARN ResourceManager High Availability
+  and describes how to configure and use this feature. The ResourceManager (RM)
+  is responsible for tracking the resources in a cluster and scheduling
+  applications (e.g., MapReduce jobs). Prior to Hadoop 2.4, the ResourceManager
+  was the single point of failure in a YARN cluster. The High Availability
+  feature adds redundancy in the form of an Active/Standby ResourceManager pair
+  to remove this otherwise single point of failure. Furthermore, upon failover
+  from the Active ResourceManager to the Standby, the applications can resume
+  from their last checkpointed state, using
+  the {{{./ResourceManagerRestart.html}ResourceManager Restart}} feature.
+  This allows handling (1) unplanned events like machine crashes,
+  and (2) planned maintenance events such as
+  software or hardware upgrades on the machine running the ResourceManager
+  without any significant performance impact on running applications.
+
+* Architecture
+
+** RM Failover
+
+  ResourceManager HA is realized through an
+  Active/Standby architecture: one RM is Active, and one or more RMs are in
+  Standby mode, waiting to take over should anything happen to the Active.
+  The trigger to transition to Active comes either from the admin (through the CLI)
+  or from the integrated failover controller when automatic failover is enabled.
+
+*** Manual transitions and failover
+
+  When automatic failover is not enabled, admins have to manually transition
+  one of the RMs to Active. To fail over from one RM to the other, they are
+  expected to first transition the Active RM to Standby and then transition a
+  Standby RM to Active. All this can be done using the "<<>>" CLI.
+
+*** Automatic failover
+
+  The RMs embed the ZooKeeper-based ActiveStandbyElector to decide which RM
+  should be the Active. When the Active goes down or becomes unresponsive,
+  another RM is automatically elected to be the Active and takes over. Note
+  that there is no need to run a separate ZKFC daemon, as there is for
+  HDFS.
+
+*** Client, ApplicationMaster and NodeManager failover
+
+  When there are multiple RMs, the YARN configuration (yarn-site.xml) used by
+  clients and nodes is expected to list all the RMs. Clients,
+  ApplicationMasters (AMs) and NodeManagers (NMs) try connecting to the RMs in
+  a round-robin fashion until they hit the Active RM. If the Active goes down,
+  they resume the round-robin polling until they hit the "new" Active.
+  You can override this logic by
+  implementing <<>> and
+  setting the value of <<>> to
+  the class name.
+
+** Taking over the state
+
+  With {{{./ResourceManagerRestart.html}ResourceManager Restart}} enabled,
+  the RM being promoted to Active loads the internal state and
+  continues to operate as if the previous Active had never gone down. The scheduler
+  reconstructs its state from node heartbeats. A new attempt is spawned for
+  each managed application previously submitted to the RM. Applications can
+  checkpoint periodically to avoid losing any work.
+  The state store must be visible from both the Active and Standby RMs.
+  The <<>> implicitly allows write access to a
+  single RM at any point in time, and hence is the recommended store to use in
+  an HA cluster. When using the ZKRMStateStore, there is no need for a separate
+  fencing mechanism to address a potential split-brain situation where multiple
+  RMs assume the Active role.
+
+* Deployment
+
+** Configurations
+
+  All the above features are controlled by configuration knobs. Here is a list
+  of the required/important ones. yarn-default.xml carries the full list of knobs.
+  See {{{../hadoop-yarn-common/yarn-default.xml}yarn-default.xml}}
+  for more information, including default values.
+  See also {{{./ResourceManagerRestart.html}the ResourceManager Restart document}}
+  for setting up the state store.
+
+*-------------------------+----------------------------------------------+
+|| Configuration Property || Description |
+*-------------------------+----------------------------------------------+
+| yarn.resourcemanager.zk-address | |
+| | Address of the ZK-quorum. |
+| | Used both for the state store and for embedded leader election. |
+*-------------------------+----------------------------------------------+
+| yarn.resourcemanager.ha.enabled | |
+| | Enable RM HA. |
+*-------------------------+----------------------------------------------+
+| yarn.resourcemanager.ha.rm-ids | |
+| | List of logical IDs for the RMs, |
+| | e.g., "rm1,rm2". |
+*-------------------------+----------------------------------------------+
+| yarn.resourcemanager.hostname. | |
+| | For each , specify the hostname the |
+| | RM corresponds to. Alternatively, one could set each of the RPC addresses. |
+*-------------------------+----------------------------------------------+
+| yarn.resourcemanager.ha.id | |
+| | Identifies the RM in the ensemble. This is optional; |
+| | however, if set, ensure that each RM has its own ID in its configuration. |
+*-------------------------+----------------------------------------------+
+| yarn.resourcemanager.ha.automatic-failover.enabled | |
+| | Enable automatic failover. |
+| | By default, it is enabled only when HA is enabled. |
+*-------------------------+----------------------------------------------+
+| yarn.resourcemanager.ha.automatic-failover.embedded | |
+| | Use the embedded leader elector |
+| | to pick the Active RM when automatic failover is enabled. By default, |
+| | it is enabled only when HA is enabled. |
+*-------------------------+----------------------------------------------+
+| yarn.resourcemanager.cluster-id | |
+| | Identifies the cluster. Used by the elector to |
+| | ensure an RM doesn't take over as Active for another cluster. |
+*-------------------------+----------------------------------------------+
+| yarn.client.failover-proxy-provider | |
+| | The class to be used by Clients, AMs and NMs to fail over to the Active RM. |
+*-------------------------+----------------------------------------------+
+| yarn.client.failover-max-attempts | |
+| | The maximum number of times FailoverProxyProvider should attempt failover. |
+*-------------------------+----------------------------------------------+
+| yarn.client.failover-sleep-base-ms | |
+| | The sleep base (in milliseconds) to be used for calculating |
+| | the exponential delay between failovers. |
+*-------------------------+----------------------------------------------+
+| yarn.client.failover-sleep-max-ms | |
+| | The maximum sleep time (in milliseconds) between failovers. |
+*-------------------------+----------------------------------------------+
+| yarn.client.failover-retries | |
+| | The number of retries per attempt to connect to a ResourceManager. |
+*-------------------------+----------------------------------------------+
+| yarn.client.failover-retries-on-socket-timeouts | |
+| | The number of retries per attempt to connect to a ResourceManager on socket timeouts. |
+*-------------------------+----------------------------------------------+
+
+*** Sample configurations
+
+  Here is a sample of a minimal setup for RM failover.
+
++---+
+ <property>
+   <name>yarn.resourcemanager.ha.enabled</name>
+   <value>true</value>
+ </property>
+ <property>
+   <name>yarn.resourcemanager.cluster-id</name>
+   <value>cluster1</value>
+ </property>
+ <property>
+   <name>yarn.resourcemanager.ha.rm-ids</name>
+   <value>rm1,rm2</value>
+ </property>
+ <property>
+   <name>yarn.resourcemanager.hostname.rm1</name>
+   <value>master1</value>
+ </property>
+ <property>
+   <name>yarn.resourcemanager.hostname.rm2</name>
+   <value>master2</value>
+ </property>
+ <property>
+   <name>yarn.resourcemanager.zk-address</name>
+   <value>zk1:2181,zk2:2181,zk3:2181</value>
+ </property>
++---+
+
+** Admin commands
+
+  <<>> has a few HA-specific command options to check the health/state of an
+  RM and to transition it to Active/Standby.
+  HA commands take the service ID of an RM, as set by <<>>,
+  as an argument.
+
++---+
+ $ yarn rmadmin -getServiceState rm1
+ active
+
+ $ yarn rmadmin -getServiceState rm2
+ standby
++---+
+
+  If automatic failover is enabled, you cannot use the manual transition commands.
+
++---+
+ $ yarn rmadmin -transitionToStandby rm1
+ Automatic failover is enabled for org.apache.hadoop.yarn.client.RMHAServiceTarget@1d8299fd
+ Refusing to manually manage HA state, since it may cause
+ a split-brain scenario or other incorrect state.
+ If you are very sure you know what you are doing, please
+ specify the forcemanual flag.
++---+
+
+  See {{{./YarnCommands.html}YarnCommands}} for more details.
+
+** Web UI
+
+  The Standby RM's web UI automatically redirects to the Active, except for the "About" page.
+
+** Web Services
+
+  The web services on the Standby RM automatically redirect to the Active.
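
To observe the web services redirect from a shell, a minimal sketch follows. It is an illustration only and not part of the patch: the helper names rm_ws_url and show_rm_headers are hypothetical, master2 is taken to be the Standby host from the sample configuration, 8088 is assumed to be the RM web port (the usual default for yarn.resourcemanager.webapp.address), and the metrics resource under /ws/v1/cluster is picked merely as an example of an RM REST resource.

```shell
#!/bin/sh
# Sketch only: host names, the port and the chosen REST resource are assumptions.

# Build the URL of an RM REST resource.
#   $1 = RM host, $2 = resource under /ws/v1/cluster, $3 = web port (default 8088)
rm_ws_url() {
  printf 'http://%s:%s/ws/v1/cluster/%s\n' "$1" "${3:-8088}" "$2"
}

# Print only the response headers for a request to the given RM.
# curl is not asked to follow redirects (no -L), so a redirect issued by a
# Standby RM stays visible in the printed status line and headers.
#   $1 = RM host, $2 = resource (default: metrics)
show_rm_headers() {
  curl -s -o /dev/null -D - "$(rm_ws_url "$1" "${2:-metrics}")"
}

# Example (requires a running HA cluster):
#   show_rm_headers master2
```

Run against the Standby, show_rm_headers should print a redirect response that points at the Active RM; the exact status code and headers may differ between Hadoop versions, so compare the result with what "yarn rmadmin -getServiceState" reports for the same RM.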