Spark / SPARK-12655

GraphX does not unpersist RDDs

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.5.2, 1.6.0
    • Fix Version/s: 1.6.2, 2.0.0
    • Component/s: GraphX
    • Labels:
      None

      Description

      It looks like calling unpersist on a Graph does not remove all of its RDDs from the cache.

      // Open spark-shell (1.5.2 or 1.6.0) and run:
      
      import org.apache.spark.graphx._
      
      // Three vertices and two edges, one partition each
      val vert = sc.parallelize(List((1L, 1), (2L, 2), (3L, 3)), 1)
      val edges = sc.parallelize(List(Edge[Long](1L, 2L), Edge[Long](1L, 3L)), 1)
      
      val g0 = Graph(vert, edges)
      // Repartition the edges into 2 partitions
      val g = g0.partitionBy(PartitionStrategy.EdgePartition2D, 2)
      val cc = g.connectedComponents()
      
      // Unpersist every graph and input RDD created above
      cc.unpersist()
      g.unpersist()
      g0.unpersist()
      vert.unpersist()
      edges.unpersist()
      

      Then open http://localhost:4040/storage/
      The Storage page of the Spark UI (port 4040) still shows 2 cached RDDs:

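      RDD Name   Storage Level                       Cached Partitions  Fraction Cached  Size in Memory  Size in ExternalBlockStore  Size on Disk
      VertexRDD  Memory Deserialized 1x Replicated   1                  100%             1688.0 B        0.0 B                       0.0 B
      EdgeRDD    Memory Deserialized 1x Replicated   2                  100%             4.7 KB          0.0 B                       0.0 B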
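
      The same check can be made from the shell instead of the Storage page. The sketch below is not part of the original report. It relies on SparkContext.getPersistentRDDs, and the helper names stillCached and unpersistAll are hypothetical. unpersistAll is only a blunt workaround: it drops every RDD the context still tracks, including any cached on purpose, so it is an assumption that this is acceptable in the session at hand.

```scala
import org.apache.spark.SparkContext

// List every RDD the context still tracks as persistent,
// as (RDD id, RDD name, storage level description).
def stillCached(sc: SparkContext): Seq[(Int, String, String)] =
  sc.getPersistentRDDs.toSeq.sortBy(_._1).map { case (id, rdd) =>
    // rdd.name is null when no name was set
    (id, Option(rdd.name).getOrElse("(unnamed)"), rdd.getStorageLevel.description)
  }

// Hypothetical workaround: unpersist everything still cached.
// Returns the number of RDDs that were unpersisted.
def unpersistAll(sc: SparkContext): Int = {
  val leftovers = sc.getPersistentRDDs.values.toSeq
  leftovers.foreach(_.unpersist(blocking = true))
  leftovers.size
}
```

      In spark-shell, after the steps above, stillCached(sc).foreach(println) should list the leftover entries, and unpersistAll(sc) clears them.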
      

            People

            • Assignee:
              jason412 Jason C Lee
            • Reporter:
              apivovarov Alexander Pivovarov
            • Votes:
              0
            • Watchers:
              4
