Affects Version/s: 2.3.0
Fix Version/s: None
Component/s: Spark Submit
When submitting a Spark application with --deploy-mode cluster to a Spark standalone cluster, environment variables from the client machine overwrite the server's environment variables.
We use the SPARK_DIST_CLASSPATH environment variable to add extra required dependencies to the application. We observed that the client machine's SPARK_DIST_CLASSPATH overwrites the value on the remote server machine, causing application submission to fail.
We have inspected the code and found:
1. In org.apache.spark.deploy.Client line 86:
2. In org.apache.spark.launcher.WorkerCommandBuilder line 35:
Line 35 overwrites the environment on the server machine with the client's values, while line 36 restores SPARK_HOME to the server value.
We think the bug can be fixed by adding a line that restores SPARK_DIST_CLASSPATH to its server value, as is already done for SPARK_HOME.
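To make the described behaviour and the suggested fix concrete, here is a minimal Python model of the environment merge. It is only an illustration and not Spark's actual code: the function name build_worker_env, the restore parameter, and the example paths are all hypothetical, and we assume the child environment starts from the server's values before the client's values are applied on top.

```python
def build_worker_env(client_env, server_env, restore=("SPARK_HOME",)):
    """Toy model of the merge described in this report (not Spark's code).

    Assumption: the child process starts from the server's environment,
    the client's variables are copied over it, and then the variables
    named in `restore` are reset to the server's values.
    """
    child_env = dict(server_env)   # start from the server's environment
    child_env.update(client_env)   # client values overwrite server values
    for name in restore:           # reset selected variables to the server value
        if name in server_env:
            child_env[name] = server_env[name]
    return child_env


# Hypothetical example values for the two machines.
client = {"SPARK_HOME": "/client/spark", "SPARK_DIST_CLASSPATH": "/client/libs/*"}
server = {"SPARK_HOME": "/server/spark", "SPARK_DIST_CLASSPATH": "/server/libs/*"}

# Behaviour as reported: only SPARK_HOME is restored.
current = build_worker_env(client, server)

# Behaviour with the suggested fix: SPARK_DIST_CLASSPATH is restored as well.
proposed = build_worker_env(
    client, server, restore=("SPARK_HOME", "SPARK_DIST_CLASSPATH")
)
```

In this model, current keeps the client's SPARK_DIST_CLASSPATH while proposed keeps the server's, which is the difference the suggested one-line change is meant to achieve.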