Details
- Type: Bug
- Status: Resolved
- Priority: Minor
- Resolution: Fixed
- Affects Versions: 0.7.0, 0.7.1, 0.7.2, 0.7.3
- None
Description
Currently the Spark storage UI aggregates RDDInfo by block name, and in the block manager every block name has the form rdd_<rddId>_<splitId>. In Spark Streaming, however, input block names take the form input-<streamId>-<uniqueId>, which causes an exception when grouping RDD info by block name in StorageUtils.scala:
val groupedRddBlocks = infos.groupBy {
  case (k, v) => k.substring(0, k.lastIndexOf('_'))
}.mapValues(_.values.toArray)
Deriving the RDD name from everything before the last '_' fails under Spark Streaming, because streaming input block names contain no underscore, so lastIndexOf('_') returns -1:
java.lang.StringIndexOutOfBoundsException: String index out of range: -1
at java.lang.String.substring(String.java:1958)
at spark.storage.StorageUtils$$anonfun$3.apply(StorageUtils.scala:49)
at spark.storage.StorageUtils$$anonfun$3.apply(StorageUtils.scala:48)
at scala.collection.TraversableLike$$anonfun$groupBy$1.apply(TraversableLike.scala:315)
at scala.collection.TraversableLike$$anonfun$groupBy$1.apply(TraversableLike.scala:314)
at scala.collection.immutable.HashMap$HashMap1.foreach(HashMap.scala:178)
at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:347)
at scala.collection.TraversableLike$class.groupBy(TraversableLike.scala:314)
at scala.collection.immutable.HashMap.groupBy(HashMap.scala:38)
at spark.storage.StorageUtils$.rddInfoFromBlockStatusList(StorageUtils.scala:48)
at spark.storage.StorageUtils$.rddInfoFromStorageStatus(StorageUtils.scala:40)
at spark.storage.BlockManagerUI$$anonfun$5.apply(BlockManagerUI.scala:54)
....
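The failure can be reproduced in a few lines (the block names below are illustrative): for any block name without an underscore, lastIndexOf('_') returns -1, and String.substring(0, -1) throws StringIndexOutOfBoundsException.

```scala
import scala.util.Try

object Repro {
  def main(args: Array[String]): Unit = {
    // One ordinary RDD block and one streaming-style input block.
    val infos = Map("rdd_0_1" -> 1L, "input-0-1234567" -> 2L)

    // Same grouping key as StorageUtils.scala: everything before the last '_'.
    val result = Try {
      infos.groupBy { case (k, _) => k.substring(0, k.lastIndexOf('_')) }
    }

    // "input-0-1234567" has no '_', so lastIndexOf returns -1 and
    // substring(0, -1) throws StringIndexOutOfBoundsException.
    println(result.isFailure) // true
  }
}
```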
There are two possible fixes:
1. Filter out all of Spark Streaming's input blocks before grouping.
2. Treat Spark Streaming's input blocks as a special case and add code to support them.
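Both options can be sketched against the grouping code above. This is only an illustration of the two approaches, not the committed fix; the helper names and the Map[String, Long] signature are assumptions for the sketch.

```scala
object StorageUtilsFix {
  // Option 1: drop Spark Streaming's input blocks before grouping,
  // keeping only ordinary rdd_<rddId>_<splitId> blocks.
  def groupFiltering(infos: Map[String, Long]): Map[String, Map[String, Long]] =
    infos
      .filter { case (k, _) => k.startsWith("rdd_") }
      .groupBy { case (k, _) => k.substring(0, k.lastIndexOf('_')) }

  // Option 2: treat streaming input blocks as a special case — if a name
  // contains no '_', use the whole name as the group key instead of
  // calling substring with index -1.
  def groupSpecialCasing(infos: Map[String, Long]): Map[String, Map[String, Long]] =
    infos.groupBy { case (k, _) =>
      val i = k.lastIndexOf('_')
      if (i >= 0) k.substring(0, i) else k
    }

  def main(args: Array[String]): Unit = {
    val infos = Map("rdd_0_0" -> 1L, "rdd_0_1" -> 2L, "input-0-1" -> 3L)
    println(groupFiltering(infos).keySet == Set("rdd_0"))                 // true
    println(groupSpecialCasing(infos).keySet == Set("rdd_0", "input-0-1")) // true
  }
}
```

Option 1 hides streaming blocks from the storage UI entirely, while option 2 keeps them visible as their own groups; which is preferable depends on whether the UI should report streaming input storage.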