Details
- Type: Improvement
- Status: Resolved
- Priority: Minor
- Resolution: Fixed
- Affects Version/s: 1.3.1
- Fix Version/s: None
- Component/s: YARN
Description
When specifying non-spark properties (i.e., property names that do not start with `spark.`) on the command line or in the config file, spark-submit and spark-shell behave differently, which confuses users.
Here is a summary:

| | spark-submit | spark-shell |
|---|---|---|
| `--conf k=v` | silently ignored | applied |
| `spark-defaults.conf` | warning shown, then ignored | warning shown, then ignored |
I assume that ignoring non-spark properties is intentional. If so, they should be ignored consistently in all cases, with a warning message.
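The consistent behavior proposed above can be sketched as a single filtering step applied to all property sources. This is an illustrative sketch only, not Spark's actual implementation; the function name and data shapes are assumptions.

```python
# Illustrative sketch (not Spark's actual code): a uniform policy that
# warns about and drops any property whose name does not start with
# "spark.", regardless of whether it came from --conf or a config file.
import warnings


def filter_spark_properties(props):
    """Keep only spark.* properties; warn about and drop the rest."""
    kept = {}
    for key, value in props.items():
        if key.startswith("spark."):
            kept[key] = value
        else:
            warnings.warn(
                "Ignoring non-spark config property: %s=%s" % (key, value)
            )
    return kept


if __name__ == "__main__":
    conf = {"spark.executor.memory": "2g", "my.internal.tag": "job-42"}
    # Only the spark.* entry survives; the other triggers a warning.
    print(filter_spark_properties(conf))
```

Applying the same filter to both the `--conf` path and the `spark-defaults.conf` path would give users one predictable behavior with a visible warning.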
The reason I bring this up is as follows. In my production Hadoop jobs, I set a couple of internal properties in the job config to keep track of extra information. When I tried to do the same in Spark by setting a non-spark property on the command line, I found that it works in spark-shell but not in spark-submit. This was confusing at first, and I had to spend some time to fully understand the different behaviors (as described above).