The Docker-on-YARN feature has been stable in Hadoop for a while now.
One can run Spark on Docker using the Docker-on-YARN feature by passing the container runtime environment variables to the Spark AM and executor containers, similar to this:
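A typical invocation looks like the following (the image name, mount paths, and application jar are placeholders; the `YARN_CONTAINER_RUNTIME_*` environment variables come from Hadoop's Docker container runtime):

```shell
# Requires a YARN cluster with the Docker container runtime enabled.
# The env vars must be set separately for the AM (spark.yarn.appMasterEnv.*)
# and for the executors (spark.executorEnv.*).
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --conf spark.yarn.appMasterEnv.YARN_CONTAINER_RUNTIME_TYPE=docker \
  --conf spark.yarn.appMasterEnv.YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=my-registry/spark:latest \
  --conf spark.yarn.appMasterEnv.YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS=/etc/passwd:/etc/passwd:ro \
  --conf spark.executorEnv.YARN_CONTAINER_RUNTIME_TYPE=docker \
  --conf spark.executorEnv.YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=my-registry/spark:latest \
  --conf spark.executorEnv.YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS=/etc/passwd:/etc/passwd:ro \
  --class org.apache.spark.examples.SparkPi \
  spark-examples.jar 1000
```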
This is not very user friendly. I suggest adding CLI options to specify:
- whether a Docker image should be used (--docker)
- which Docker image should be used (--docker-image)
- which Docker mounts should be used (--docker-mounts)
for the AM and executor containers separately.
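With such options, the submission above could shrink to something like this (the option names and shape are only a sketch of the proposal; the exact syntax, including how to target the AM and executors separately, is open for discussion):

```shell
# Hypothetical syntax illustrating the proposed options; not yet implemented.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --docker \
  --docker-image my-registry/spark:latest \
  --docker-mounts /etc/passwd:/etc/passwd:ro \
  --class org.apache.spark.examples.SparkPi \
  spark-examples.jar 1000
```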