Details
- Type: New Feature
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- Fix Version: 3.4.0
- Hadoop Flags: Reviewed
Description
When using YARN Docker support, the Hadoop shell supports the -docker_client_config option to pass a client config file containing the security token, which YARN uses to generate a temporary Docker config for each job. Other applications that submit jobs to YARN, e.g. Spark, load Docker settings through environment variables such as spark.executorEnv.*, and cannot pass those authorization tokens because these environment variables are not considered by YARN. Add a generic solution that handles such cases without requiring changes in Spark or other frameworks.
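For comparison, the existing Hadoop shell path mentioned above passes the client config directly on the command line. A minimal sketch only: the example jar, job arguments, image name, and the exact placement and spelling of the -docker_client_config option are illustrative assumptions, not taken from this issue.
# Illustrative only: run the Pi example in Docker containers and hand the
# Docker client config (holding the registry credentials) to the shell option.
yarn jar hadoop-mapreduce-examples.jar pi \
  -Dyarn.app.mapreduce.am.env="YARN_CONTAINER_RUNTIME_TYPE=docker,YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=hadoop-docker" \
  -Dmapreduce.map.env="YARN_CONTAINER_RUNTIME_TYPE=docker,YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=hadoop-docker" \
  -Dmapreduce.reduce.env="YARN_CONTAINER_RUNTIME_TYPE=docker,YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=hadoop-docker" \
  -docker_client_config=hdfs:///user/hadoop/config.json \
  10 100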
Example:
When using a remote container registry, the YARN_CONTAINER_RUNTIME_DOCKER_CLIENT_CONFIG environment variable must reference the config.json file containing the credentials used to authenticate with the registry.
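That config.json is typically produced by docker login and then uploaded to HDFS so the containers can read it; a minimal sketch, where the registry host and HDFS destination are placeholders:
# Log in to the remote registry; Docker writes the credentials to ~/.docker/config.json.
docker login registry.example.com
# Upload the client config to HDFS so the job environment can reference it.
hdfs dfs -put ~/.docker/config.json /user/hadoop/config.json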
DOCKER_IMAGE_NAME=hadoop-docker
DOCKER_CLIENT_CONFIG=hdfs:///user/hadoop/config.json
spark-submit --master yarn \
--deploy-mode cluster \
--conf spark.executorEnv.YARN_CONTAINER_RUNTIME_TYPE=docker \
--conf spark.executorEnv.YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=$DOCKER_IMAGE_NAME \
--conf spark.executorEnv.YARN_CONTAINER_RUNTIME_DOCKER_CLIENT_CONFIG=$DOCKER_CLIENT_CONFIG \
--conf spark.yarn.appMasterEnv.YARN_CONTAINER_RUNTIME_TYPE=docker \
--conf spark.yarn.appMasterEnv.YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=$DOCKER_IMAGE_NAME \
--conf spark.yarn.appMasterEnv.YARN_CONTAINER_RUNTIME_DOCKER_CLIENT_CONFIG=$DOCKER_CLIENT_CONFIG \
sparkR.R
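With the proposed change, YARN reads YARN_CONTAINER_RUNTIME_DOCKER_CLIENT_CONFIG from the container environment and generates the temporary Docker config for the ApplicationMaster and executor containers, so no Spark-side code change is required.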