Spark / SPARK-39356

Add option to skip initial message in Pregel API

Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 3.2.1
    • Fix Version/s: None
    • Component/s: GraphX

    Description

      In the current release (3.2.1), the Pregel API takes a parameter initialMsg: A, where A (which requires a scala.reflect.ClassTag) is the message type for the Pregel iterations. At the start of the iterative process, the user-supplied vertex program vprog is called on every vertex with this initial message.

      However, some message-passing schemes are more naturally started from the message phase than from the vprog phase, and often the first message depends on individual vertex data rather than on a static initial message. In these cases, users must add boilerplate to vprog that checks whether the received message is initialMsg and, if so, ignores it (leaving the vertex state unchanged). The result is code that is less efficient (because of the extra iteration and check) and less readable.
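The boilerplate described above might look like the following minimal sketch. It is plain Scala with no Spark dependency; the object name SentinelVprog, the sentinel value, and the "adopt the incoming label" update rule are all assumptions chosen for illustration, not code from GraphX or from the reporter.

```scala
object SentinelVprog {
  // Mirrors GraphX's VertexId alias (a Long); redefined here so the sketch is self-contained.
  type VertexId = Long

  // Hypothetical sentinel that would be passed as initialMsg.
  // It must never occur as a genuine message, which is itself a burden on the user.
  val InitialMsg: Long = -1L

  // Vertex program that adopts the incoming label as the new vertex state.
  // The first branch exists only to ignore the mandatory initial message.
  def vprog(id: VertexId, label: Long, msg: Long): Long =
    if (msg == InitialMsg) label // initial round: leave the vertex state unchanged
    else msg                     // real message: overwrite the state
}
```

The guard runs on every vprog call for the lifetime of the computation, even though it can only ever match during the first round.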
       
      My proposed solution is to change initialMsg to a parameter of type Option[A] with default None, and then, inside the Pregel.apply function, set:

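      var g = initialMsg match {
        // An initial message was supplied: run vprog on every vertex first, as today
        case Some(msg) => graph.mapVertices((vid, vdata) => vprog(vid, vdata, msg))
        // No initial message: keep the graph as-is and start from the message phase
        case _ => graph
      }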
      

      This way, the user chooses whether to start the iteration from the message or vprog phase. I believe this small change could improve user code readability and efficiency.
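To make the effect of the proposal concrete, here is a toy in-memory model of the Pregel loop with an optional initial message. It is only a sketch under stated assumptions: PregelSketch, Edge, Triplet, and MaxPropagation are hypothetical plain-Scala stand-ins written for this example, not GraphX classes, and the loop only approximates what Pregel.apply does on a distributed graph. It assumes every edge endpoint appears in the vertex map.

```scala
object PregelSketch {
  type VertexId = Long // same underlying type as GraphX's VertexId

  // Hypothetical stand-ins for GraphX's Edge and EdgeTriplet.
  final case class Edge[ED](srcId: VertexId, dstId: VertexId, attr: ED)
  final case class Triplet[VD, ED](
      srcId: VertexId, srcAttr: VD, dstId: VertexId, dstAttr: VD, attr: ED)

  def pregel[VD, ED, A](
      vertices: Map[VertexId, VD],
      edges: Seq[Edge[ED]],
      initialMsg: Option[A] = None,
      maxIterations: Int = Int.MaxValue)(
      vprog: (VertexId, VD, A) => VD,
      sendMsg: Triplet[VD, ED] => Iterator[(VertexId, A)],
      mergeMsg: (A, A) => A): Map[VertexId, VD] = {

    // Message phase: run sendMsg on edges touching an active vertex, merge per target.
    def collect(state: Map[VertexId, VD], active: Set[VertexId]): Map[VertexId, A] =
      edges
        .filter(e => active(e.srcId) || active(e.dstId))
        .flatMap(e => sendMsg(Triplet(e.srcId, state(e.srcId), e.dstId, state(e.dstId), e.attr)))
        .groupBy(_._1)
        .map { case (id, msgs) => id -> msgs.map(_._2).reduce(mergeMsg) }

    // The proposed change: only run the initial vprog round if a message was supplied.
    var g: Map[VertexId, VD] = initialMsg match {
      case Some(msg) => vertices.map { case (id, v) => id -> vprog(id, v, msg) }
      case None      => vertices
    }

    var messages = collect(g, g.keySet) // first message phase: every vertex is active
    var i = 0
    while (messages.nonEmpty && i < maxIterations) {
      // vprog phase: only vertices that received a message are updated.
      g = g.map { case (id, v) => id -> messages.get(id).fold(v)(m => vprog(id, v, m)) }
      messages = collect(g, messages.keySet)
      i += 1
    }
    g
  }
}

// Usage: propagate the maximum value along the directed path 1 -> 2 -> 3.
object MaxPropagation {
  import PregelSketch._

  val vertices: Map[VertexId, Double] = Map(1L -> 5.0, 2L -> 1.0, 3L -> 3.0)
  val edges: Seq[Edge[Unit]] = Seq(Edge(1L, 2L, ()), Edge(2L, 3L, ()))

  // Returns the final vertex values and the number of vprog invocations.
  def run(initialMsg: Option[Double]): (Map[VertexId, Double], Int) = {
    var calls = 0
    // Type arguments are explicit so that A is not inferred from a bare None.
    val result = pregel[Double, Unit, Double](vertices, edges, initialMsg)(
      (_, value, msg) => { calls += 1; math.max(value, msg) },
      t => if (t.srcAttr > t.dstAttr) Iterator((t.dstId, t.srcAttr)) else Iterator.empty,
      (a, b) => math.max(a, b))
    (result, calls)
  }
}
```

On this three-vertex graph, MaxPropagation.run(None) and MaxPropagation.run(Some(Double.NegativeInfinity)) converge to the same values, but the first invokes vprog twice and the second five times, because the initial round touches every vertex. Note that in this sketch the message type had to be given explicitly at the call site when relying on None; whether the same would hold for a changed GraphX signature is not something the sketch establishes.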

      Note: The signature of GraphOps.pregel would have to be changed to match.
       

       

          People

            Assignee: Unassigned
            Reporter: Aaron Zolnai-Lucas (aaronzo)