Details
Type: Bug
Status: Resolved
Priority: Critical
Resolution: Fixed
Affects Version/s: 1.6.1
Fix Version/s: None
Environment: OSX El Capitan (java "1.8.0_65"), Oracle Linux 6 (java 1.8.0_92-b14)
Description
The Spark Streaming documentation recommends that application developers create static connection pools. To clean up such a pool, we register a shutdown hook, as in the sketch below.
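For reference, the pattern looks roughly like this: a static (per-JVM) pool that is created lazily on each executor and registers a shutdown hook to release its resources. The Pool class and the names used here are illustrative stand-ins, not code taken from the Spark documentation.

object ExecutorPool {
  // Stand-in for a real connection pool (e.g. a JDBC pool); purely illustrative.
  class Pool {
    def borrow(): String = "connection"          // hand out a (fake) connection
    def close(): Unit = println("pool closed")   // release all pooled resources
  }

  // Created lazily, once per JVM, the first time a task on that executor touches it.
  lazy val pool: Pool = {
    val p = new Pool
    // Register cleanup so the pool is released when the executor JVM shuts down.
    sys.addShutdownHook {
      p.close()
    }
    p
  }
}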
The problem is that in Spark 1.6.1, the shutdown hook on an executor is invoked only for the first submitted job; on the second and subsequent job submissions, the executor's shutdown hook is NOT invoked.
The problem is not seen when using Java 1.7.
The problem is not seen when using Spark 1.6.0.
The bug appears to be caused by this change introduced between 1.6.0 and 1.6.1:
https://issues.apache.org/jira/browse/SPARK-12486
Steps to reproduce the problem:
1.) Install Spark 1.6.1.
2.) Submit this basic Spark application (an example spark-submit invocation is given after these steps):
import org.apache.spark.{SparkContext, SparkConf}

object MyPool {
  def printToFile(f: java.io.File)(op: java.io.PrintWriter => Unit) {
    val p = new java.io.PrintWriter(f)
    try { op(p) } finally { p.close() }
  }

  // Called from the executors; its only purpose is to force MyPool to be
  // initialized (and its shutdown hook registered) in each executor JVM.
  def myfunc() = { }

  // Writes the evidence file; invoked from the shutdown hook below.
  def createEvidence() = {
    printToFile(new java.io.File("/var/tmp/evidence.txt")) { p => p.println("evidence") }
  }

  // Registered when MyPool is first initialized in a JVM.
  sys.addShutdownHook {
    createEvidence()
  }
}

object BasicSpark {
  def main(args: Array[String]) = {
    val sparkConf = new SparkConf().setAppName("BasicPi")
    val sc = new SparkContext(sparkConf)
    // Reference MyPool on the executors so its shutdown hook is registered there.
    sc.parallelize(1 to 2).foreach { _ => MyPool.myfunc() }
    sc.stop()
  }
}
3.) Observe that /var/tmp/evidence.txt is created.
4.) Delete this file.
5.) Submit the same job a second time.
6.) Observe that /var/tmp/evidence.txt is NOT created on the second submission.
7.) With Java 7 or Spark 1.6.0, the evidence file is created on the second and subsequent submissions.
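For each submission step above, assuming the application is packaged into a jar named basic-spark.jar (the jar name is hypothetical), the command would be something like: spark-submit --class BasicSpark --master <your-master-url> basic-spark.jar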