Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Incomplete
- Affects Version/s: 1.6.0
- Fix Version/s: None
- Patch
Description
A Spark application submitted via spark-submit fails because resource localization fails for the second application master attempt.
The scenario is as follows.
- I submitted an application via spark-submit.
- The application master failed. (The reason is not important here, but for the record: the jar was compiled with Java 8, while the cluster nodes had Java 7 installed.)
- The first application master attempt deleted the staging directory on failure. (Log excerpt below.)
- Deleting staging directory .sparkStaging/application_1542037527280_0015
- YARN restarted the application master on another node, where the NodeManager could not find the resource in HDFS, resulting in the following (from the NodeManager logs):
- Diagnostics: File does not exist: hdfs://mpcdh001.informatica.com:8020/user/sampleuser/.sparkStaging/application_1542037527280_0007/_spark_conf_7096300976806305459.zip
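
The sequence above can be reproduced in miniature with a toy model. This is only a sketch of the reported behaviour, not Spark or YARN code: a local temporary directory stands in for HDFS, and the names `run_attempt`, `simulate`, `application_example` and `_spark_conf_.zip` are hypothetical placeholders chosen for illustration.

```python
import shutil
import tempfile
from pathlib import Path
from typing import List


def run_attempt(staging_dir: Path, attempt: int, fail: bool) -> str:
    """Toy model of one application master attempt.

    Localization is modeled as checking that the staged conf archive exists.
    Cleanup on failure is modeled as deleting the whole staging directory.
    """
    conf_zip = staging_dir / "_spark_conf_.zip"  # hypothetical file name
    if not conf_zip.exists():
        # Mirrors the NodeManager diagnostic: the resource is gone.
        return f"attempt {attempt}: localization failed, file does not exist: {conf_zip.name}"
    if fail:
        # Mirrors the first attempt deleting the staging directory on failure.
        shutil.rmtree(staging_dir)
        return f"attempt {attempt}: failed, deleted staging directory"
    return f"attempt {attempt}: succeeded"


def simulate() -> List[str]:
    """Run a failing first attempt followed by a retry against the same staging dir."""
    with tempfile.TemporaryDirectory() as root:
        staging_dir = Path(root) / ".sparkStaging" / "application_example"
        staging_dir.mkdir(parents=True)
        (staging_dir / "_spark_conf_.zip").write_bytes(b"")
        return [
            run_attempt(staging_dir, 1, fail=True),
            # The retry would otherwise succeed, but its resources were removed.
            run_attempt(staging_dir, 2, fail=False),
        ]


if __name__ == "__main__":
    for line in simulate():
        print(line)
```

In this model the second attempt never gets to run application code at all: it fails at localization because the first attempt removed the shared staging directory, which is the behaviour described in the report.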