SPARK-23394: Storage info's Cached Partitions doesn't consider the replications (but sc.getRDDStorageInfo does)


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.3.0
    • Fix Version/s: 2.3.0, 2.4.0
    • Component/s: Spark Core
    • Labels: None

    Description

      Start Spark in local-cluster mode (2 executors, 1 core and 1024 MB each):

      $ bin/spark-shell --master local-cluster[2,1,1024]
      
      scala> import org.apache.spark.storage.StorageLevel._
      import org.apache.spark.storage.StorageLevel._
      
      scala> sc.parallelize((1 to 100), 10).persist(MEMORY_AND_DISK_2).count
      res0: Long = 100                                                                
      
      scala> sc.getRDDStorageInfo(0).numCachedPartitions
      res1: Int = 20
      

      Cached Partitions

      On the UI, at the Storage tab, Cached Partitions is 10:

      [Screenshot: see attached Storage_Tab.png]

      Full tab

      Moreover, the replicated partitions were also listed on the old 2.2.1:

      [Screenshot: see attached Spark_2.2.1.png]

      But now it is like:

      [Screenshot: see attached Spark_2.4.0-SNAPSHOT.png]
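      The discrepancy can be modeled with a short sketch (hypothetical, not Spark's actual implementation): if each cached block replica is tracked as a (rddId, partitionIndex, executorId) tuple, then counting every replica gives 20 for this RDD, while counting distinct partitions gives 10. The function names below are illustrative only.

```python
# Hypothetical model of the discrepancy (not Spark's actual code).
# Each cached block replica is identified by (rdd_id, partition_index, executor_id).

def num_cached_with_replicas(blocks):
    """Count every replica as a cached partition -- the behavior
    sc.getRDDStorageInfo(0).numCachedPartitions exhibits here (10 x 2 = 20)."""
    return len(blocks)

def num_cached_distinct(blocks):
    """Count each partition once, regardless of replication -- the behavior
    the Storage tab's Cached Partitions column shows (10)."""
    return len({partition for (_rdd, partition, _executor) in blocks})

# 10 partitions of RDD 0, each replicated on two executors (MEMORY_AND_DISK_2).
blocks = [(0, p, e) for p in range(10) for e in ("executor-1", "executor-2")]
print(num_cached_with_replicas(blocks))  # 20
print(num_cached_distinct(blocks))       # 10
```

      The bug report is that these two views disagree: one side should be fixed to count partitions consistently.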

      Attachments

        1. Storage_Tab.png (29 kB, Attila Zsolt Piros)
        2. Spark_2.4.0-SNAPSHOT.png (182 kB, Attila Zsolt Piros)
        3. Spark_2.2.1.png (286 kB, Attila Zsolt Piros)

          People

            Assignee: attilapiros Attila Zsolt Piros
            Reporter: attilapiros Attila Zsolt Piros
            Votes: 0
            Watchers: 4
