Details
- Bug
- Status: Resolved
- Minor
- Resolution: Fixed
- 1.1.0
- None
Description
When a Shark job finishes, the following error messages appear in the log:
INFO 22:10:41,635 SparkMaster: akka.tcp://sparkDriver@ip-172-31-11-204.us-west-1.compute.internal:57641 got disassociated, removing it.
INFO 22:10:41,640 SparkMaster: Removing app app-20141106221014-0000
INFO 22:10:41,687 SparkMaster: Removing application Shark::ip-172-31-11-204.us-west-1.compute.internal
INFO 22:10:41,710 SparkWorker: Asked to kill executor app-20141106221014-0000/0
INFO 22:10:41,712 SparkWorker: Runner thread for executor app-20141106221014-0000/0 interrupted
INFO 22:10:41,714 SparkWorker: Killing process!
ERROR 22:10:41,738 SparkWorker: Error writing stream to file /var/lib/spark/work/app-20141106221014-0000/0/stdout
ERROR 22:10:41,739 SparkWorker: java.io.IOException: Stream closed
ERROR 22:10:41,739 SparkWorker: at java.io.BufferedInputStream.getBufIfOpen(BufferedInputStream.java:162)
ERROR 22:10:41,740 SparkWorker: at java.io.BufferedInputStream.read1(BufferedInputStream.java:272)
ERROR 22:10:41,740 SparkWorker: at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
ERROR 22:10:41,740 SparkWorker: at java.io.FilterInputStream.read(FilterInputStream.java:107)
ERROR 22:10:41,741 SparkWorker: at org.apache.spark.util.logging.FileAppender.appendStreamToFile(FileAppender.scala:70)
ERROR 22:10:41,741 SparkWorker: at org.apache.spark.util.logging.FileAppender$$anon$1$$anonfun$run$1.apply$mcV$sp(FileAppender.scala:39)
ERROR 22:10:41,741 SparkWorker: at org.apache.spark.util.logging.FileAppender$$anon$1$$anonfun$run$1.apply(FileAppender.scala:39)
ERROR 22:10:41,742 SparkWorker: at org.apache.spark.util.logging.FileAppender$$anon$1$$anonfun$run$1.apply(FileAppender.scala:39)
ERROR 22:10:41,742 SparkWorker: at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1311)
ERROR 22:10:41,742 SparkWorker: at org.apache.spark.util.logging.FileAppender$$anon$1.run(FileAppender.scala:38)
INFO 22:10:41,838 SparkMaster: Connected to Cassandra cluster: 4299
INFO 22:10:41,839 SparkMaster: Adding host 172.31.11.204 (Analytics)
INFO 22:10:41,840 SparkMaster: New Cassandra host /172.31.11.204:9042 added
INFO 22:10:41,841 SparkMaster: Adding host 172.31.11.204 (Analytics)
INFO 22:10:41,842 SparkMaster: Adding host 172.31.11.204 (Analytics)
INFO 22:10:41,852 SparkMaster: akka.tcp://sparkDriver@ip-172-31-11-204.us-west-1.compute.internal:57641 got disassociated, removing it.
INFO 22:10:41,853 SparkMaster: akka.tcp://sparkDriver@ip-172-31-11-204.us-west-1.compute.internal:57641 got disassociated, removing it.
INFO 22:10:41,853 SparkMaster: akka.tcp://sparkDriver@ip-172-31-11-204.us-west-1.compute.internal:57641 got disassociated, removing it.
INFO 22:10:41,857 SparkMaster: akka.tcp://sparkDriver@ip-172-31-11-204.us-west-1.compute.internal:57641 got disassociated, removing it.
INFO 22:10:41,862 SparkMaster: Adding host 172.31.11.204 (Analytics)
WARN 22:10:42,200 SparkMaster: Got status update for unknown executor app-20141106221014-0000/0
INFO 22:10:42,211 SparkWorker: Executor app-20141106221014-0000/0 finished with state KILLED exitStatus 143
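The file /var/lib/spark/work/app-20141106221014-0000/0/stdout exists on disk, so the target file itself is not the problem. The worker is trying to write to an IO stream that has already been closed.
The Spark worker shuts the executor process down with: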
    private def killProcess(message: Option[String]) {
      var exitCode: Option[Int] = None
      logInfo("Killing process!")
      process.destroy()
      process.waitFor()
      if (stdoutAppender != null) {
        stdoutAppender.stop()
      }
      if (stderrAppender != null) {
        stderrAppender.stop()
      }
      if (process != null) {
        exitCode = Some(process.waitFor())
      }
      worker ! ExecutorStateChanged(appId, execId, state, message, exitCode)
    }
However, stdoutAppender is still copying the process output to the log file on its own thread when the process is destroyed, which creates a race condition.
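The race can be reproduced in isolation, and one possible way to make it harmless is to treat an IOException that arrives after a stop request as an expected part of shutdown. The sketch below is only an illustration under stated assumptions: SketchAppender, markedForStop, unexpectedError, and runAgainstClosedStream are hypothetical names, not Spark's FileAppender API, and it assumes that destroying the process closes the stream the appender reads from. The thread is started after the stream is closed so that the outcome is deterministic.

```scala
import java.io.{BufferedInputStream, ByteArrayInputStream, ByteArrayOutputStream}
import java.io.{IOException, InputStream, OutputStream}

// Hypothetical stand-in for the appender thread; this is NOT Spark's FileAppender.
class SketchAppender(in: InputStream, out: OutputStream) {
  @volatile private var markedForStop = false
  // Set only when a failure was not caused by a requested shutdown.
  @volatile var unexpectedError: Option[Throwable] = None

  private val thread = new Thread("sketch-appender") {
    override def run(): Unit = {
      val buf = new Array[Byte](1024)
      try {
        var n = in.read(buf)
        while (n != -1 && !markedForStop) {
          out.write(buf, 0, n)
          n = in.read(buf)
        }
      } catch {
        // A read that fails after stop() was requested is the expected shutdown race.
        case _: IOException if markedForStop => ()
        // Anything else is a genuine error and is recorded for the caller.
        case e: Throwable => unexpectedError = Some(e)
      }
    }
  }

  def start(): Unit = thread.start()
  def stop(): Unit = { markedForStop = true }
  def awaitTermination(): Unit = thread.join()
}

object AppenderRaceSketch {
  // Runs the appender against a stream that is already closed, mimicking the
  // assumed effect of process.destroy() on the executor's stdout.
  def runAgainstClosedStream(requestStop: Boolean): Option[Throwable] = {
    val source = new ByteArrayInputStream("executor output".getBytes("UTF-8"))
    val in = new BufferedInputStream(source)
    val appender = new SketchAppender(in, new ByteArrayOutputStream())
    if (requestStop) appender.stop()
    in.close() // later reads throw java.io.IOException: Stream closed
    appender.start()
    appender.awaitTermination()
    appender.unexpectedError
  }

  def main(args: Array[String]): Unit = {
    println(runAgainstClosedStream(requestStop = true))
    println(runAgainstClosedStream(requestStop = false))
  }
}
```

In this sketch, stopping the appender before the stream is closed leaves unexpectedError empty, while closing the stream first surfaces the same "Stream closed" exception seen in the log. This suggests that the order of destroy() and stop() in killProcess matters.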
Attachments
Issue Links
- is related to SPARK-9844 File appender race condition during SparkWorker shutdown (Resolved)
- relates to SPARK-9844 File appender race condition during SparkWorker shutdown (Resolved)
- links to