Spark / SPARK-12655

GraphX does not unpersist RDDs

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.5.2, 1.6.0
    • Fix Version/s: 1.6.2, 2.0.0
    • Component/s: GraphX
    • Labels: None

    Description

      It looks like Graph.unpersist() does not remove all of the graph's RDDs from the cache.

      // Open spark-shell (Spark 1.5.2 or 1.6.0) and run:

      import org.apache.spark.graphx._

      // Three vertices and two edges, one partition each
      val vert = sc.parallelize(List((1L, 1), (2L, 2), (3L, 3)), 1)
      val edges = sc.parallelize(List(Edge[Long](1L, 2L), Edge[Long](1L, 3L)), 1)

      // Build the graph, repartition its edges, then compute connected components
      val g0 = Graph(vert, edges)
      val g = g0.partitionBy(PartitionStrategy.EdgePartition2D, 2)
      val cc = g.connectedComponents()

      // Unpersist every graph and both input RDDs
      cc.unpersist()
      g.unpersist()
      g0.unpersist()
      vert.unpersist()
      edges.unpersist()

      Then open http://localhost:4040/storage/
      The Storage page of the Spark UI (port 4040) still shows 2 cached items:

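      RDD Name   Storage Level                       Cached Partitions  Fraction Cached  Size in Memory  Size in ExternalBlockStore  Size on Disk
      VertexRDD  Memory Deserialized 1x Replicated   1                  100%             1688.0 B        0.0 B                       0.0 B
      EdgeRDD    Memory Deserialized 1x Replicated   2                  100%             4.7 KB          0.0 B                       0.0 B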
      

          People

            Assignee: jason412 Jason C Lee
            Reporter: apivovarov Alexander Pivovarov
            Votes: 0
            Watchers: 4