  1. Spark
  2. SPARK-2527

incorrect persistence level shown in Spark UI after repersisting

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Cannot Reproduce
    • Affects Version/s: 1.0.0
    • Fix Version/s: 1.2.0
    • Component/s: Web UI
    • Labels:
      None

      Description

      If I persist an RDD at one level, unpersist it, then repersist it at another level, the UI continues to show the RDD at the first level, but correctly shows the individual partitions at the second level.

      import org.apache.spark.api.java.StorageLevels
      import org.apache.spark.api.java.StorageLevels._
      val test1 = sc.parallelize(Array(1,2,3))
      test1.persist(StorageLevels.DISK_ONLY)
      test1.count()
      test1.unpersist()
      test1.persist(StorageLevels.MEMORY_ONLY)
      test1.count()
      

      After the first call to persist and count, the Spark App web UI shows:

      RDD Storage Info for 14 Storage Level: Disk Serialized 1x Replicated
      rdd_14_0 Disk Serialized 1x Replicated

      After the second call, it shows:

      RDD Storage Info for 14 Storage Level: Disk Serialized 1x Replicated
      rdd_14_0 Memory Deserialized 1x Replicated
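
The same comparison can be made without the web UI. The following is a minimal sketch, not part of the original report: it assumes a standalone local context (in spark-shell, reuse the provided `sc` instead of creating one), uses the Scala `StorageLevel` object rather than the Java `StorageLevels` class, and reads the RDD-level storage info through the developer API `sc.getRDDStorageInfo`. Whether that API reports the same value as the UI page on a given Spark version is an assumption, so no expected output is shown.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

// Local context for illustration only (assumption); in spark-shell, skip these
// two lines and use the existing `sc`.
val conf = new SparkConf().setMaster("local[1]").setAppName("persist-level-check")
val sc = new SparkContext(conf)

val test1 = sc.parallelize(Array(1, 2, 3))

// First persistence level, materialized by an action.
test1.persist(StorageLevel.DISK_ONLY)
test1.count()

// Drop the cached blocks and wait until they are removed.
test1.unpersist(blocking = true)

// Second persistence level, materialized again.
test1.persist(StorageLevel.MEMORY_ONLY)
test1.count()

// Level recorded on the RDD object itself.
val rddLevel = test1.getStorageLevel

// Level reported by the storage info developer API for the same RDD id, if listed.
val reportedLevel = sc.getRDDStorageInfo.find(_.id == test1.id).map(_.storageLevel)

println(s"RDD ${test1.id} level on the RDD object: ${rddLevel.description}")
println(s"RDD ${test1.id} level from getRDDStorageInfo: ${reportedLevel.map(_.description)}")

sc.stop()
```

If the two printed descriptions differ, the mismatch is in the RDD-level bookkeeping rather than in the page rendering; if they agree, the stale value is more likely specific to the UI.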

        Attachments

        1. persistbug2.png
          33 kB
          Diana Carroll
        2. persistbug1.png
          32 kB
          Diana Carroll

            People

            • Assignee: Josh Rosen (joshrosen)
            • Reporter: Diana Carroll (dcarroll@cloudera.com)
            • Votes: 0
            • Watchers: 2
