Spark / SPARK-1015

Visualize the DAG of RDD

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 0.9.0
    • Fix Version/s: None
    • Component/s: Spark Core
    • Labels: None

    Description

      The DAG of an RDD can help users understand the data flow and how Spark executes the final RDD. It can also help users find opportunities to optimize the execution of complex RDDs. I will use Graphviz to visualize the DAG.

      I plan to split this task into two steps.

      Step 1. Visualize the plain DAG. Each RDD is one node, with one edge from each parent RDD to its child RDD. (I have attached a simple example graph.)
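
A minimal sketch of what Step 1 could produce, written in plain Python so it runs without Spark. The `Node` class and `lineage_to_dot` function are hypothetical stand-ins, not Spark APIs; in Spark itself the parents of an RDD would presumably be read from `rdd.dependencies`. The sketch walks the lineage from the final node and emits Graphviz DOT text with one node per RDD and one edge per parent/child pair.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for an RDD: an id, a label, and its parent nodes."""
    id: int
    name: str
    parents: List["Node"] = field(default_factory=list)


def lineage_to_dot(final: Node) -> str:
    """Walk the lineage from the final node and return Graphviz DOT text."""
    seen = set()
    node_lines, edge_lines = [], []
    stack = [final]
    while stack:
        node = stack.pop()
        if node.id in seen:
            continue  # a node shared by several children is emitted once
        seen.add(node.id)
        node_lines.append(f'  n{node.id} [label="{node.name} [{node.id}]"];')
        for parent in node.parents:
            # Edges point from parent to child, following the data flow.
            edge_lines.append(f"  n{parent.id} -> n{node.id};")
            stack.append(parent)
    body = sorted(node_lines) + sorted(edge_lines)
    return "\n".join(["digraph lineage {", *body, "}"])


# Illustrative word-count-style lineage (made-up example data).
text_file = Node(0, "textFile")
words = Node(1, "flatMap", [text_file])
pairs = Node(2, "map", [words])
counts = Node(3, "reduceByKey", [pairs])

dot = lineage_to_dot(counts)
print(dot)
```

If the output is saved to a file such as `lineage.dot`, it can be rendered with the Graphviz command line, for example `dot -Tpng lineage.dot -o lineage.png`.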

      Step 2. Put the RDDs that belong to the same stage into one subgraph. This may require extracting the stage-splitting code from DAGScheduler.
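
A rough sketch of the grouping in Step 2, again in plain Python with hypothetical names (`Node`, `stages_to_dot`) rather than Spark's own classes. It assumes that a stage boundary falls on every shuffle dependency, and it models that with a boolean flag on each parent edge. Stage numbers here are assigned in discovery order from the final node, which is a simplification and does not reproduce the stage IDs that DAGScheduler would assign. Each stage becomes a Graphviz `cluster` subgraph, which `dot` draws as a box.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    """Hypothetical stand-in for an RDD. Each parent is (node, is_shuffle)."""
    id: int
    name: str
    parents: List[Tuple["Node", bool]] = field(default_factory=list)


def stages_to_dot(final: Node) -> str:
    """Return DOT text with one cluster subgraph per stage."""
    stage_of, by_id, edges = {}, {}, []
    next_stage = 0
    pending = [(final, 0)]
    while pending:
        node, stage = pending.pop()
        if node.id in stage_of:
            continue  # simplification: the first stage assignment wins
        stage_of[node.id] = stage
        by_id[node.id] = node
        for parent, is_shuffle in node.parents:
            style = " [style=dashed]" if is_shuffle else ""
            edges.append(f"  n{parent.id} -> n{node.id}{style};")
            if is_shuffle:
                next_stage += 1  # a shuffle edge starts a new stage
                pending.append((parent, next_stage))
            else:
                pending.append((parent, stage))  # narrow edge: same stage

    members = defaultdict(list)
    for node_id, stage in stage_of.items():
        members[stage].append(node_id)

    lines = ["digraph stages {"]
    for stage in sorted(members):
        lines.append(f"  subgraph cluster_{stage} {{")
        lines.append(f'    label="stage {stage}";')
        for node_id in sorted(members[stage]):
            lines.append(f'    n{node_id} [label="{by_id[node_id].name} [{node_id}]"];')
        lines.append("  }")
    lines.extend(sorted(edges))
    lines.append("}")
    return "\n".join(lines)


# Illustrative lineage: only the last edge is marked as a shuffle.
text_file = Node(0, "textFile")
words = Node(1, "flatMap", [(text_file, False)])
pairs = Node(2, "map", [(words, False)])
counts = Node(3, "reduceByKey", [(pairs, True)])

dot = stages_to_dot(counts)
print(dot)
```

Drawing the shuffle edges dashed is only a presentation choice; it makes the stage boundaries visible even when the cluster boxes are hard to read in a large graph.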

      Attachments

          People

            Assignee: Unassigned
            Reporter: Jeff Zhang (zjffdu)
            Votes: 0
            Watchers: 12

            Dates

              Created:
              Updated:
              Resolved: