  1. Spark
  2. SPARK-17559

PeriodicGraphCheckpointer did not persist edges as expected in some cases

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.0.2, 2.1.0
    • Component/s: MLlib
    • Labels: None

    Description

      When PeriodicGraphCheckpointer is used to persist a graph, the edges are sometimes not persisted. Currently the graph is persisted only when the vertices' storage level is NONE. However, the vertices' storage level can be something other than NONE while the edges' storage level is still NONE. For example, in a graph created by an outerJoinVertices operation, the vertices are cached automatically while the edges are not. In that case the edges are not persisted when PeriodicGraphCheckpointer is asked to persist the graph.

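      See the minimal example below:

      import org.apache.spark.graphx.GraphLoader
      // Package as of the affected versions; the class is private[mllib].
      import org.apache.spark.mllib.impl.PeriodicGraphCheckpointer

      // sc is an existing SparkContext.
      val graphCheckpointer = new PeriodicGraphCheckpointer[Array[String], Int](2, sc)
      val users = sc.textFile("data/graphx/users.txt")
        .map(line => line.split(",")).map(parts => (parts.head.toLong, parts.tail))
      val followerGraph = GraphLoader.edgeListFile(sc, "data/graphx/followers.txt")

      // The joined graph has cached vertices but uncached edges.
      val graph = followerGraph.outerJoinVertices(users) {
        case (uid, deg, Some(attrList)) => attrList
        case (uid, deg, None) => Array.empty[String]
      }

      graphCheckpointer.update(graph)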
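
      The condition described above can be illustrated with a small sketch. This is not the actual patch for this issue; `GraphPersistSketch`, `storageLevels`, and `persistIfNeeded` are hypothetical names introduced here. The sketch assumes that checking the vertices and the edges separately, instead of using the vertices' storage level as a proxy for the whole graph, is enough to avoid the problem.

```scala
import org.apache.spark.graphx.Graph
import org.apache.spark.storage.StorageLevel

// Hypothetical helpers for illustration only; not part of Spark.
object GraphPersistSketch {

  // Returns the current storage levels of (vertices, edges) so the two
  // sides of the graph can be inspected independently.
  def storageLevels[VD, ED](g: Graph[VD, ED]): (StorageLevel, StorageLevel) =
    (g.vertices.getStorageLevel, g.edges.getStorageLevel)

  // Caches each side only if it has no storage level yet.
  // Assumption: checking edges separately covers the case where the
  // vertices are already cached but the edges are not.
  def persistIfNeeded[VD, ED](g: Graph[VD, ED]): Unit = {
    if (g.vertices.getStorageLevel == StorageLevel.NONE) {
      g.vertices.cache()
    }
    if (g.edges.getStorageLevel == StorageLevel.NONE) {
      g.edges.cache()
    }
  }
}
```

      With the graph from the example, calling `GraphPersistSketch.storageLevels(graph)` before and after `graphCheckpointer.update(graph)` is one way to observe whether the edges were persisted.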

          People

            dingding dingding
            ding ding
            Votes: 0
            Watchers: 4

              Time Tracking

                Original Estimate: 1h
                Remaining Estimate: 1h
                Time Spent: Not Specified