Spark / SPARK-23394

Storage info's Cached Partitions doesn't consider replication (but sc.getRDDStorageInfo does)

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.3.0
    • Fix Version/s: 2.3.0, 2.4.0
    • Component/s: Spark Core
    • Labels:
      None

      Description

      Start spark as:

      $ bin/spark-shell --master local-cluster[2,1,1024]
      
      scala> import org.apache.spark.storage.StorageLevel._
      import org.apache.spark.storage.StorageLevel._
      
      scala> sc.parallelize((1 to 100), 10).persist(MEMORY_AND_DISK_2).count
      res0: Long = 100                                                                
      
      scala> sc.getRDDStorageInfo(0).numCachedPartitions
      res1: Int = 20
      
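With MEMORY_AND_DISK_2, each of the 10 partitions is stored with two replicas, so a replica-aware count yields 20 while counting distinct partitions yields 10. A minimal sketch of the two counting strategies in plain Scala (not Spark's actual implementation; the block-location map is a made-up stand-in for the block manager's state):

```scala
// Sketch only: a hypothetical map of partition index -> executors holding a replica.
// MEMORY_AND_DISK_2 means replication factor 2 for each of the 10 partitions.
val numPartitions = 10
val replication = 2
val blockLocations: Map[Int, Seq[String]] =
  (0 until numPartitions).map(p => p -> Seq.fill(replication)("executor")).toMap

// Counting every replica (the behavior sc.getRDDStorageInfo(0).numCachedPartitions shows):
val replicaAware = blockLocations.values.map(_.size).sum

// Counting distinct partitions (the behavior the Storage tab shows):
val distinctPartitions = blockLocations.keys.size

println(s"replica-aware: $replicaAware")             // 20
println(s"distinct partitions: $distinctPartitions") // 10
```

The discrepancy reported here is exactly this: the two code paths count the same cached blocks at different granularities.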

On the UI, in the Storage tab, Cached Partitions is 10:

[image: Cached Partitions]

[image: full tab]

Moreover, the replicated partitions were also listed in the old 2.2.1 UI:

[image: Spark 2.2.1]

But now it looks like:

[image: Spark 2.4.0-SNAPSHOT]

        Attachments

        1. Storage_Tab.png
          29 kB
          Attila Zsolt Piros
        2. Spark_2.4.0-SNAPSHOT.png
          182 kB
          Attila Zsolt Piros
        3. Spark_2.2.1.png
          286 kB
          Attila Zsolt Piros


            People

• Assignee: Attila Zsolt Piros (attilapiros)
• Reporter: Attila Zsolt Piros (attilapiros)
• Votes: 0
• Watchers: 4

              Dates

              • Created:
                Updated:
                Resolved: