Details
Type: Sub-task
Status: Closed
Priority: Major
Resolution: Fixed
Description
Seeing sporadic failures during test setup. Specifically, when spark-submit runs, this error (or a similar one) is thrown:
2018-05-15T10:55:02,112 INFO [RemoteDriver-stderr-redir-27d3dcfb-2a10-4118-9fae-c200d2e095a5 main] client.SparkSubmitSparkClient: Exception in thread "main" java.io.FileNotFoundException: File file:/tmp/spark-56e217f7-b8a5-4c63-9a6b-d737a64f2820/__spark_libs__7371510645900072447.zip does not exist
2018-05-15T10:55:02,113 INFO [RemoteDriver-stderr-redir-27d3dcfb-2a10-4118-9fae-c200d2e095a5 main] client.SparkSubmitSparkClient: at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:641)
2018-05-15T10:55:02,113 INFO [RemoteDriver-stderr-redir-27d3dcfb-2a10-4118-9fae-c200d2e095a5 main] client.SparkSubmitSparkClient: at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:867)
2018-05-15T10:55:02,113 INFO [RemoteDriver-stderr-redir-27d3dcfb-2a10-4118-9fae-c200d2e095a5 main] client.SparkSubmitSparkClient: at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:631)
2018-05-15T10:55:02,113 INFO [RemoteDriver-stderr-redir-27d3dcfb-2a10-4118-9fae-c200d2e095a5 main] client.SparkSubmitSparkClient: at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:442)
2018-05-15T10:55:02,113 INFO [RemoteDriver-stderr-redir-27d3dcfb-2a10-4118-9fae-c200d2e095a5 main] client.SparkSubmitSparkClient: at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:365)
2018-05-15T10:55:02,113 INFO [RemoteDriver-stderr-redir-27d3dcfb-2a10-4118-9fae-c200d2e095a5 main] client.SparkSubmitSparkClient: at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:316)
2018-05-15T10:55:02,113 INFO [RemoteDriver-stderr-redir-27d3dcfb-2a10-4118-9fae-c200d2e095a5 main] client.SparkSubmitSparkClient: at org.apache.spark.deploy.yarn.Client.copyFileToRemote(Client.scala:356)
2018-05-15T10:55:02,113 INFO [RemoteDriver-stderr-redir-27d3dcfb-2a10-4118-9fae-c200d2e095a5 main] client.SparkSubmitSparkClient: at org.apache.spark.deploy.yarn.Client.org$apache$spark$deploy$yarn$Client$$distribute$1(Client.scala:478)
2018-05-15T10:55:02,113 INFO [RemoteDriver-stderr-redir-27d3dcfb-2a10-4118-9fae-c200d2e095a5 main] client.SparkSubmitSparkClient: at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:565)
2018-05-15T10:55:02,113 INFO [RemoteDriver-stderr-redir-27d3dcfb-2a10-4118-9fae-c200d2e095a5 main] client.SparkSubmitSparkClient: at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:863)
2018-05-15T10:55:02,113 INFO [RemoteDriver-stderr-redir-27d3dcfb-2a10-4118-9fae-c200d2e095a5 main] client.SparkSubmitSparkClient: at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:169)
2018-05-15T10:55:02,113 INFO [RemoteDriver-stderr-redir-27d3dcfb-2a10-4118-9fae-c200d2e095a5 main] client.SparkSubmitSparkClient: at org.apache.spark.deploy.yarn.Client.run(Client.scala:1146)
2018-05-15T10:55:02,113 INFO [RemoteDriver-stderr-redir-27d3dcfb-2a10-4118-9fae-c200d2e095a5 main] client.SparkSubmitSparkClient: at org.apache.spark.deploy.yarn.YarnClusterApplication.start(Client.scala:1518)
2018-05-15T10:55:02,113 INFO [RemoteDriver-stderr-redir-27d3dcfb-2a10-4118-9fae-c200d2e095a5 main] client.SparkSubmitSparkClient: at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:879)
2018-05-15T10:55:02,113 INFO [RemoteDriver-stderr-redir-27d3dcfb-2a10-4118-9fae-c200d2e095a5 main] client.SparkSubmitSparkClient: at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:197)
2018-05-15T10:55:02,113 INFO [RemoteDriver-stderr-redir-27d3dcfb-2a10-4118-9fae-c200d2e095a5 main] client.SparkSubmitSparkClient: at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:227)
2018-05-15T10:55:02,113 INFO [RemoteDriver-stderr-redir-27d3dcfb-2a10-4118-9fae-c200d2e095a5 main] client.SparkSubmitSparkClient: at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:136)
2018-05-15T10:55:02,113 INFO [RemoteDriver-stderr-redir-27d3dcfb-2a10-4118-9fae-c200d2e095a5 main] client.SparkSubmitSparkClient: at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Essentially, Spark is writing some files for container localization to a tmp dir, and that tmp dir is getting deleted before the files can be copied (the trace fails in Client.copyFileToRemote). We have seen a lot of issues with writing files to /tmp/ in the past, so it's probably best to write these files to a test-specific dir.
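One possible way to do this, sketched in Java: create a uniquely named scratch directory under the test's own working area and pass it to spark-submit through the `spark.local.dir` setting. This is only an illustration, not the actual fix. The class name `TestScratchDir`, the helper names, the `test.tmp.dir` property name, and the `target/tmp` fallback are all assumptions, and it is also an assumption that `spark.local.dir` controls where the `__spark_libs__` archive is staged in this setup.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; none of these names come from the original report.
public class TestScratchDir {

    // Creates a uniquely named scratch directory under baseDir rather than
    // the shared system /tmp. Missing parent directories are created first.
    static Path createScratchDir(Path baseDir, String testName) throws IOException {
        Files.createDirectories(baseDir);
        return Files.createTempDirectory(baseDir, testName + "-");
    }

    // Builds extra spark-submit arguments that point Spark's scratch space at
    // scratchDir. Assumption: spark.local.dir decides where the localization
    // files from the stack trace are written.
    static List<String> sparkSubmitConfArgs(Path scratchDir) {
        List<String> args = new ArrayList<>();
        args.add("--conf");
        args.add("spark.local.dir=" + scratchDir.toAbsolutePath());
        return args;
    }

    public static void main(String[] args) throws IOException {
        // "test.tmp.dir" is an assumed property name; "target/tmp" is an
        // assumed fallback for a Maven-style build.
        Path base = Paths.get(System.getProperty("test.tmp.dir", "target/tmp"));
        Path scratch = createScratchDir(base, "spark-on-yarn");
        System.out.println(String.join(" ", sparkSubmitConfArgs(scratch)));
    }
}
```

Keeping the directory under the test's own tree means that cleanup of /tmp by other processes or tests should no longer affect it, and the files are removed together with the rest of the test output.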