Scala/Python users can add files to a Spark job via the --files submit option or SparkContext.addFile(), and then retrieve an added file with SparkFiles.get(filename).
We should also support this functionality for SparkR users, since they have the same need to share dependency files. For example, a SparkR user could first download third-party R packages to the driver, add them to the Spark job as dependencies through this API, and then have each executor install the packages with install.packages().
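A minimal sketch of the workflow described above, assuming SparkR counterparts named spark.addFile() and spark.getSparkFiles() that mirror SparkContext.addFile() and SparkFiles.get() (the function names and package URL are illustrative, not a final API):

```r
library(SparkR)
sparkR.session()

# Driver side: download a third-party package source tarball, then
# attach it to the job so it is shipped to every executor.
# (URL and file name are placeholders for illustration.)
download.file("https://example.org/cran/somepkg_1.0.tar.gz",
              "somepkg_1.0.tar.gz")
spark.addFile("somepkg_1.0.tar.gz")

# Executor side: resolve the local path of the distributed copy and
# install the package from source.
path <- spark.getSparkFiles("somepkg_1.0.tar.gz")
install.packages(path, repos = NULL, type = "source")
```

This mirrors the existing Scala/Python pattern: addFile() distributes the file once per job, and the get-style accessor returns the executor-local path.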
- is related to
SPARK-17428 SparkR executors/workers support virtualenv