Description
If I run the following on a YARN cluster:
bin/spark-submit sheep.py --master yarn-client
it fails because of a mismatch in paths: `spark-submit` thinks that `sheep.py` resides on HDFS, and balks when it can't find the file there. A natural workaround is to add the `file:` prefix to the path:
bin/spark-submit file:/path/to/sheep.py --master yarn-client
However, this also fails, this time because Python does not understand URI schemes.
This PR fixes both failures by automatically and correctly resolving all paths passed as command-line arguments to `spark-submit`. This has the added benefit of keeping file and jar paths consistent across different cluster modes.
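
To make the idea of "resolving" a path concrete, here is a minimal Python sketch. It is not the code in this PR: `resolve_uri` and `resolve_uris` are hypothetical helpers written only for illustration, and they assume POSIX-style paths (a Windows drive letter such as `C:` would be misread as a scheme). A path that already carries a scheme is left alone, and a bare path becomes an absolute `file:` URI.

```python
import os
from urllib.parse import urlparse


def resolve_uri(path):
    """Hypothetical helper: return `path` as a URI with an explicit scheme.

    Paths that already have a scheme (file:, hdfs:, ...) are returned
    unchanged; bare paths are made absolute and prefixed with `file:`.
    Assumes POSIX-style paths.
    """
    if urlparse(path).scheme:
        return path
    return "file:" + os.path.abspath(path)


def resolve_uris(paths):
    """Hypothetical helper: resolve a comma-separated list of paths."""
    return ",".join(resolve_uri(p) for p in paths.split(",") if p)


if __name__ == "__main__":
    print(resolve_uri("sheep.py"))
    print(resolve_uri("file:/path/to/sheep.py"))
    print(resolve_uris("/path/to/a.jar,hdfs://namenode/b.jar"))
```

With every path normalized like this, a local script is no longer mistaken for an HDFS file, and the local path can be recovered from the `file:` URI before it is handed to Python.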