Details
Bug
Status: Resolved
Major
Resolution: Fixed
0.9.0
None
Linux Ubuntu 14.04, a single spark node; standalone mode.
Description
conf/spark-defaults.conf
spark.akka.frameSize 5
spark.default.parallelism 1
scala> val collection = (1 to 1000000).map(i => ("foo" + i, i)).toVector
collection: Vector[(String, Int)] = Vector((foo1,1), (foo2,2), (foo3,3), (foo4,4), (foo5,5), (foo6,6), (foo7,7), (foo8,8), (foo9,9), (foo10,10), (foo11,11), (foo12,12), (foo13,13), (foo14,14), (foo15,15), (foo16,16), (foo17,17), (foo18,18), (foo19,19), (foo20,20), (foo21,21), (foo22,22), (foo23,23), (foo24,24), (foo25,25), (foo26,26), (foo27,27), (foo28,28), (foo29,29), (foo30,30), (foo31,31), (foo32,32), (foo33,33), (foo34,34), (foo35,35), (foo36,36), (foo37,37), (foo38,38), (foo39,39), (foo40,40), (foo41,41), (foo42,42), (foo43,43), (foo44,44), (foo45,45), (foo46,46), (foo47,47), (foo48,48), (foo49,49), (foo50,50), (foo51,51), (foo52,52), (foo53,53), (foo54,54), (foo55,55), (foo56,56), (foo57,57), (foo58,58), (foo59,59), (foo60,60), (foo61,61), (foo62,62), (foo63,63), (foo64,64), (foo...
scala> val rdd = sc.parallelize(collection)
rdd: org.apache.spark.rdd.RDD[(String, Int)] = ParallelCollectionRDD[0] at parallelize at <console>:24
scala> rdd.first
res4: (String, Int) = (foo1,1)
scala> rdd.map(_._2).sum // nothing happens
CPU and I/O idle.
Memory usage reported by JVM, after manually triggered GC:
repl: 216 MB / 2 GB
executor: 67 MB / 2 GB
worker: 6 MB / 128 MB
master: 6 MB / 128 MB
No errors found in worker's stderr/stdout.
With 700,000 elements it works fine: the request is processed and the sum calculated in about 1 second, and the Spark executor's memory usage doesn't even exceed 300 MB of the 2 GB available. With 800,000 elements it fails.
Multiple parallelized collections of 700,000 items each, held at the same time in the same session, work fine.
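Since the failure depends on element count rather than on memory, one way to investigate is to measure how large the collection is once serialized and compare that with the configured frame size. The following is only a rough sketch: it uses plain Java serialization as a stand-in, which is an assumption and may not match the encoding Spark actually uses for tasks. The names `FrameSizeCheck`, `serializedSize`, `sample`, and `frameLimitBytes` are hypothetical and not part of Spark.

```scala
import java.io.{ByteArrayOutputStream, ObjectOutputStream}

object FrameSizeCheck {
  // Size in bytes of a value under plain Java serialization
  // (assumption: a stand-in for whatever encoding Spark really uses).
  def serializedSize(value: AnyRef): Long = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    try out.writeObject(value) finally out.close()
    bytes.size().toLong
  }

  // Same shape as the collection in the report, with n elements.
  def sample(n: Int): Vector[(String, Int)] =
    (1 to n).map(i => ("foo" + i, i)).toVector

  def main(args: Array[String]): Unit = {
    // spark.akka.frameSize is given in MB; 5 is the value from the report.
    val frameLimitBytes = 5L * 1024 * 1024
    for (n <- Seq(700000, 800000, 1000000)) {
      val size = serializedSize(sample(n))
      println(s"$n elements: $size bytes, exceeds limit: ${size > frameLimitBytes}")
    }
  }
}
```

If the sizes printed for 700,000 and 800,000 elements fall on opposite sides of some message-size limit, that would point at the payload size rather than memory pressure. The numbers are indicative only, since the real wire format may differ.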
Attachments
Issue Links
- is related to
  SPARK-2156 When the size of serialized results for one partition is slightly smaller than 10MB (the default akka.frameSize), the execution blocks (Resolved)