Description
If I persist an RDD at one storage level, unpersist it, and then persist it again at a different level, the UI continues to show the RDD at the first level, but correctly shows the individual partitions at the second level.
import org.apache.spark.api.java.StorageLevels
import org.apache.spark.api.java.StorageLevels._
val test1 = sc.parallelize(Array(1,2,3))
test1.persist(StorageLevels.DISK_ONLY)
test1.count()
test1.unpersist()
test1.persist(StorageLevels.MEMORY_ONLY)
test1.count()
After the first call to persist and count, the Spark App web UI shows:
RDD Storage Info for 14 Storage Level: Disk Serialized 1x Replicated
rdd_14_0 Disk Serialized 1x Replicated
After the second call, it shows:
RDD Storage Info for 14 Storage Level: Disk Serialized 1x Replicated
rdd_14_0 Memory Deserialized 1x Replicated
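To cross-check the UI against what the driver itself reports, something like the following could be run in the same spark-shell session after the second count. This is only a sketch and has not been verified against this bug; it assumes `sc` and `test1` from the reproduction steps are still in scope, and it relies on `getRDDStorageInfo`, which is marked as a developer API.

```scala
// Storage level recorded on the RDD object itself.
println(test1.getStorageLevel.description)

// Storage level the SparkContext reports for the same RDD id.
sc.getRDDStorageInfo
  .filter(_.id == test1.id)
  .foreach(info => println(s"${info.name}: ${info.storageLevel.description}"))
```

If both lines agree with the partition-level entry but not with the RDD-level header in the UI, that would suggest the stale value is confined to the UI's own bookkeeping.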