I would like to propose a few changes to the spark_integration Dockerfile that should reduce the size of the resulting image:
- consolidating all RUN commands into a single RUN instruction, which reduces the number of layers and ensures that files deleted by the cleanup steps are not retained in an earlier layer
- running apt-get clean to clear out the apt package cache
- running conda clean --all to remove cached package tarballs, unused package versions, and other build artifacts left behind by the conda-installed libraries
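
As a rough sketch of what the combined changes might look like (the base image and the package names here are placeholders I picked for illustration, not the contents of the actual spark_integration Dockerfile):

```dockerfile
# Hypothetical base image; the real Dockerfile may start from a different one.
FROM continuumio/miniconda3

# Install and clean up within one RUN instruction so the caches are never
# committed to an image layer. The package names (curl, pyspark) are placeholders.
RUN apt-get update \
    && apt-get install -y --no-install-recommends curl \
    && conda install --yes --channel conda-forge pyspark \
    && apt-get clean \
    && conda clean --all --yes
```

The cleanup commands need to run in the same RUN instruction as the installs. If they ran in a later instruction, the cached files would still be stored in the earlier layer and the image would not get smaller.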
I will be submitting a PR on GitHub shortly. I am opening this issue first so that I can reference it from the PR.