[SPARK-2527] Incorrect persistence level shown in Spark UI after repersisting

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Cannot Reproduce
    • Affects Version/s: 1.0.0
    • Fix Version/s: 1.2.0
    • Component/s: Web UI
    • Labels: None

    Description

      If I persist an RDD at one storage level, unpersist it, and then persist it again at a different level, the UI continues to show the RDD at the first level, even though it correctly shows the individual partitions at the second level.

      import org.apache.spark.api.java.StorageLevels
      import org.apache.spark.api.java.StorageLevels._
      val test1 = sc.parallelize(Array(1,2,3))
      // First persist on disk only, then materialize the RDD
      test1.persist(StorageLevels.DISK_ONLY)
      test1.count()
      // Drop the cached data, then persist again at a different level
      test1.unpersist()
      test1.persist(StorageLevels.MEMORY_ONLY)
      test1.count()
      

      After the first call to persist and count, the Spark application web UI shows:

      RDD Storage Info for 14
      Storage Level: Disk Serialized 1x Replicated
      rdd_14_0 Disk Serialized 1x Replicated

      After the second call, it shows:

      RDD Storage Info for 14
      Storage Level: Disk Serialized 1x Replicated
      rdd_14_0 Memory Deserialized 1x Replicated
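
The same sequence can also be checked outside the web UI. The following is a minimal sketch, not part of the original report: it assumes a local master (`local[2]`), and the object name `PersistLevelCheck` and the helper `report` are hypothetical. It prints the level recorded on the RDD itself (`getStorageLevel`) next to the level in the summary returned by `SparkContext.getRDDStorageInfo`, a developer API. That summary may not mirror the web UI's internal bookkeeping, so a match here does not by itself prove the UI page is correct.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

// Hypothetical helper object for illustration only.
object PersistLevelCheck {

  // Print the level declared on the RDD and the level in the storage summary.
  def report(sc: SparkContext, rddId: Int, declared: StorageLevel, label: String): Unit = {
    println(s"$label: RDD declares '${declared.description}'")
    // getRDDStorageInfo only lists RDDs that currently have cached partitions.
    sc.getRDDStorageInfo.filter(_.id == rddId).foreach { info =>
      println(s"$label: storage info reports '${info.storageLevel.description}' " +
        s"(${info.numCachedPartitions} cached partitions)")
    }
  }

  def main(args: Array[String]): Unit = {
    // Assumption: run locally rather than against a cluster.
    val conf = new SparkConf().setAppName("persist-level-check").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val rdd = sc.parallelize(Array(1, 2, 3))

      rdd.persist(StorageLevel.DISK_ONLY)
      rdd.count() // materialize so the blocks are actually stored
      report(sc, rdd.id, rdd.getStorageLevel, "first persist")

      // Block until the cached data is removed before persisting again.
      rdd.unpersist(blocking = true)
      rdd.persist(StorageLevel.MEMORY_ONLY)
      rdd.count()
      report(sc, rdd.id, rdd.getStorageLevel, "second persist")
    } finally {
      sc.stop()
    }
  }
}
```

If the two printed levels disagree after the second persist, the stale value is present in the storage summary as well as the UI; if they agree, the discrepancy is more likely confined to how the UI tracks RDD-level information.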

      Attachments

        1. persistbug2.png
          33 kB
          Diana Carroll
        2. persistbug1.png
          32 kB
          Diana Carroll

          People

            Assignee: Josh Rosen (joshrosen)
            Reporter: Diana Carroll (dcarroll@cloudera.com)