Hi Konstantin Boudnik, so happy to see you proposed this.
If you'd like to let users run a one-node cluster, then docker -h might be the way to go.
If you'd like to let users run a multi-node cluster, docker-compose would be better (see the sketch right below).
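Just to illustrate the multi-node case, here's a minimal docker-compose sketch; the image name in_memory_stack and the service names are only placeholders for whatever image we end up burning:

# Minimal docker-compose.yml sketch; image and service names are hypothetical
master:
  image: in_memory_stack
  hostname: master
worker:
  image: in_memory_stack
  links:
    - master

Something like docker-compose up -d followed by docker-compose scale worker=3 should then bring up a small cluster.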
Speaking of this, I've been thinking about how to speed up cluster deployment for a long time, ever since I was asked about it at Apache Big Data 2015.
And here's what I have right now. The following design lets users specify components and burn images with the big data stack they'd like to have:
- Define the set of components the user would like to have installed in the image. The set can be defined in config.yaml (see the sketch after this list).
- Specify the name of the burned image, e.g. in_memory_stack, in config.yaml as well.
- Burn! Expose something like the following to the users:
./docker-hadoop -C config.yaml --burn
# Or wrapped in gradle
./gradlew -Pconfig=config.yaml burn-docker-image
- What burn does is simply run yum or apt install for those pre-defined components.
- Run a multi-node cluster as usual with the pre-installed-components image and, ta-da, you get a cluster instantly.
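To make the config part concrete, here's a rough sketch of what config.yaml could look like; the key names (components, image_name, distro) are just illustrative, nothing is final:

# Hypothetical config.yaml sketch; keys and values are examples only
components:
  - hadoop
  - spark
image_name: in_memory_stack
distro: centos    # decides whether burn uses yum or apt

The --burn step would then iterate over components and run the matching yum/apt install while building the image.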
How does this sound? If I didn't make anything clear, please point it out and I'll try to describe it in more detail.