diff --git hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/YarnApplicationSecurity.md hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/YarnApplicationSecurity.md
index bab46b9384b..30b2252b9b5 100644
--- hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/YarnApplicationSecurity.md
+++ hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/YarnApplicationSecurity.md
@@ -402,27 +402,67 @@ connection with the AM and pass up the current user's credentials).
 
 ## Securing YARN Application Web UIs and REST APIs
 
-YARN provides a straightforward way of giving every YARN application SPNEGO authenticated
-web pages: it implements SPNEGO authentication in the Resource Manager Proxy.
-
-YARN web UI are expected to load the AM proxy filter when setting up its web UI; this filter
-will redirect all HTTP Requests coming from any host other than the RM Proxy hosts to an
-RM proxy, to which the client app/browser must re-issue the request. The client will authenticate
-against the principal of the RM Proxy (usually `yarn`), and, once authenticated, have its
-request forwared.
-
-As a result, all client interactions are SPNEGO-authenticated, without the YARN application
-itself needing any kerberos principal for the clients to authenticate against.
-
-Known weaknesses in this approach are:
-
-1. As calls coming from the proxy hosts are not redirected, any application running
-on those hosts has unrestricted access to the YARN applications. This is why in a secure cluster
-the proxy hosts *must* run on cluster nodes which do not run end user code (i.e. not run YARN
-NodeManagers and hence schedule YARN containers, nor support logins by end users).
-
-1. The HTTP requests between proxy and YARN RM Server are not currently encrypted.
-That is: HTTPS is not supported.
+YARN provides a straightforward way of giving every YARN Application SPNEGO
+authenticated web pages: the RM implements SPNEGO authentication in the Resource
+Manager Proxy and restricts access to the YARN Application's Web UI to only the
+RM Proxy. There are two ways to do this:
+
+#### Option 1: AM IP Proxy Filter
+
+A YARN Application's Web Server should load the AM proxy filter (see the
+`AmFilterInitializer` class) when setting up its web UI; this filter will
+redirect all HTTP Requests coming from any host other than the RM Proxy hosts to
+an RM proxy, to which the client app/browser must re-issue the request. The
+client will authenticate against the principal of the RM Proxy (usually `yarn`),
+and, once authenticated, have its request forwarded.
+
+Known weaknesses in this option are:
+
+1. The AM proxy filter only checks the IP/host of the RM Proxy, so any
+Application running on those hosts has unrestricted access to the YARN
+Application's Web UI. This is why in a secure cluster the proxy hosts *must* run
+on cluster nodes which do not run end user code (i.e. nodes that neither run
+YARN NodeManagers, and hence schedule no YARN containers, nor support logins by
+end users).
+
+1. The HTTP requests between the RM Proxy and the YARN Application are not
+currently encrypted. That is: HTTPS is not supported.
+
+#### Option 2: HTTPS Mutual Authentication
+
+By default, YARN Application Web UIs are not encrypted (i.e. they do not use
+HTTPS). It is up to the Application to provide support for HTTPS. This can
+either be done entirely independently, with a valid HTTPS Certificate from a
+public CA or another source that the RM or JVM is configured to trust, or the RM
+can act as a limited CA and provide the Application with a Certificate it can
+use, which is only accepted by the RM proxy and by no other clients (e.g. web
+browsers). This is important because the Application cannot necessarily be
+trusted not to steal any issued Certificates or perform other malicious
+behavior. The Certificates the RM issues will (a) be expired, (b) have a Subject
+containing `CN=` instead of the typical `CN=`, and (c) be issued by a
+self-signed CA Certificate generated by the RM.
+
+For an Application to take advantage of this ability, it simply needs to load
+the provided Keystore into its Web Server of choice. The location of the
+Keystore can be found in the `KEYSTORE_FILE_LOCATION` environment variable, and
+its password in the `KEYSTORE_PASSWORD` environment variable. Both are available
+as long as `yarn.resourcemanager.application-https.policy` is *not* set to
+`NONE` (see table below) and the Application provides an HTTPS Tracking URL.
+
+Additionally, the Application can verify that the RM proxy is in fact the RM via
+HTTPS Mutual Authentication. Besides the provided Keystore, there is also a
+provided Truststore with the RM proxy's client Certificate. By loading this
+Truststore and enabling `needsClientAuth` (or equivalent) in its Web Server of
+choice, the Web Server should automatically require that the client (i.e. the RM
+proxy) provide a trusted Certificate, or it will fail the connection. This
+ensures that only the RM Proxy, which the client authenticated against, can
+access it.
+
+| `yarn.resourcemanager.application-https.policy` | Behavior |
+|:---- |:---- |
+| `NONE` | The RM will do nothing special.|
+| `LENIENT` (default) | The RM will generate and provide a keystore and truststore whenever an AM gives it an HTTPS tracking URL. It will still accept HTTP URLs though. |
+| `STRICT` | The RM will always generate and provide a keystore and truststore and require that the tracking URL for all applications is HTTPS. |
 
 ## Securing YARN Application REST APIs
 
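The Keystore hand-off described under Option 2 could look roughly like the following on the Application side. This is only a sketch: `AmHttpsSketch`, `KeystoreSettings`, `fromEnv` and `serverContext` are hypothetical names that are not part of YARN. It assumes that the key password equals the Keystore password and that the JVM's default keystore type can read the provided file. Only the two environment variable names come from the text above; Truststore loading and client-auth enforcement are left to the Web Server in use.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.KeyStore;
import java.util.Map;
import java.util.Optional;
import javax.net.ssl.KeyManagerFactory;
import javax.net.ssl.SSLContext;

/** Hypothetical helper for illustration only; not a YARN class. */
final class AmHttpsSketch {

  /** Location and password of the Keystore handed over by the RM. */
  static final class KeystoreSettings {
    final Path location;
    final char[] password;

    KeystoreSettings(Path location, char[] password) {
      this.location = location;
      this.password = password;
    }
  }

  /**
   * Reads the two environment variables named in the documentation.
   * Returns empty when either is missing, e.g. when the policy is NONE.
   */
  static Optional<KeystoreSettings> fromEnv(Map<String, String> env) {
    String location = env.get("KEYSTORE_FILE_LOCATION");
    String password = env.get("KEYSTORE_PASSWORD");
    if (location == null || password == null) {
      return Optional.empty();
    }
    return Optional.of(
        new KeystoreSettings(Paths.get(location), password.toCharArray()));
  }

  /**
   * Builds a server-side SSLContext from the provided Keystore.
   * Assumption: the key entry uses the same password as the Keystore.
   */
  static SSLContext serverContext(KeystoreSettings settings) throws Exception {
    KeyStore keyStore = KeyStore.getInstance(KeyStore.getDefaultType());
    try (InputStream in = Files.newInputStream(settings.location)) {
      keyStore.load(in, settings.password);
    }
    KeyManagerFactory kmf =
        KeyManagerFactory.getInstance(KeyManagerFactory.getDefaultAlgorithm());
    kmf.init(keyStore, settings.password);
    SSLContext context = SSLContext.getInstance("TLS");
    // Null trust managers and SecureRandom select the JVM defaults.
    context.init(kmf.getKeyManagers(), null, null);
    return context;
  }
}
```

An Application would typically call `AmHttpsSketch.fromEnv(System.getenv())` at startup and fall back to plain HTTP when the result is empty, which matches the `LENIENT` behavior in the table.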