Details

Type: Bug
Status: Closed
Priority: Minor
Resolution: Not A Problem
Affects Version/s: 1.2.0
Fix Version/s: None
Component/s: None
Description
Hello folks,
Consider the following simple app that counts words read from a network socket:
WordCount.scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf().setAppName("Sample Application")
val sc = new SparkContext(conf)
val ssc = new StreamingContext(sc, Seconds(5))
ssc.checkpoint("target/checkpointDir")

// feed input with: nc -lk 9999
val lines = ssc.socketTextStream("localhost", 9999)
val words = lines.flatMap(_.split(" "))
val pairs = words.map(word => (word, 1))
val wordCounts = pairs.reduceByKey(_ + _)
wordCounts.saveAsHadoopFiles("target/prefix", "suffix")

ssc.start()
ssc.awaitTermination(60)
When this is packaged and executed on Spark, the following exception is thrown:
java.io.NotSerializableException: org.apache.hadoop.mapred.JobConf
The use of JobConf inside the saveAsHadoopFiles methods seems to be the cause: with checkpointing enabled, the DStream graph is serialized into the checkpoint, and the JobConf captured there does not implement java.io.Serializable.
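As a hedged workaround sketch (not the resolution recorded on this issue), writing each batch explicitly from foreachRDD keeps the Hadoop configuration out of the checkpointed DStream graph, because the underlying JobConf is created at call time on the driver rather than stored in a DStream field. The output path pattern below is illustrative; it reuses the wordCounts DStream from the code above.

// Workaround sketch, assuming the wordCounts DStream defined above.
// Writing per batch avoids storing a JobConf in the DStream graph that
// gets serialized when checkpointing is enabled.
wordCounts.foreachRDD { (rdd, time) =>
  // one output directory per batch, e.g. target/prefix-1420000000000.suffix
  rdd.saveAsTextFile(s"target/prefix-${time.milliseconds}.suffix")
}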
Thanks