Spark / SPARK-999

Report More Instrumentation for Task Execution Time in UI

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.9.0
    • Component/s: None
    • Labels: None

    Description

      We should report finer-grained information about task execution time inside of the Spark UI. Here is a proposal of exactly what we should report:

      Task execution goes through a few stages on the Executor.
      1. Deserializing the task
      2. Executing the task. This pipelines a few things:
      --> Reading shuffle input
      --> Running whatever function on the RDD
      --> Writing shuffle output
      3. Serializing the result

      I'd propose we report the following five timing metrics. Many of these are already tracked in TaskMetrics.

      • Time spent deserializing the task on the executor (executorDeserializeTime)
      • Total execution time for the task (executorRunTime)
        • Time spent blocking on shuffle reads during the task (fetchWaitTime)
        • Time spent blocking on shuffle writes during the task (shuffleWriteTime)
      • Time spent serializing the result (not currently tracked)
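
      The breakdown above can be sketched as a small record plus the arithmetic a UI row would need. This is an illustrative Python sketch, not Spark code: the class, its snake_case field names, and breakdown_row are hypothetical, all values are assumed to share one unit (milliseconds), and the two shuffle timings are assumed to be contained in executorRunTime, as the nesting of the list suggests.

```python
from dataclasses import dataclass


@dataclass
class TaskTimings:
    """Hypothetical container for the five proposed timings (assumed ms)."""
    executor_deserialize_time: int  # executorDeserializeTime
    executor_run_time: int          # executorRunTime (assumed to include both shuffle timings)
    fetch_wait_time: int            # fetchWaitTime
    shuffle_write_time: int         # shuffleWriteTime
    result_serialization_time: int  # not tracked at the time of the proposal

    def compute_time(self) -> int:
        # Run time that is left after removing time blocked on shuffle I/O.
        return self.executor_run_time - self.fetch_wait_time - self.shuffle_write_time

    def total_time(self) -> int:
        # Deserialize + run + serialize. Shuffle waits are not added again
        # because they are assumed to be part of the run time already.
        return (self.executor_deserialize_time
                + self.executor_run_time
                + self.result_serialization_time)


def breakdown_row(t: TaskTimings) -> dict:
    """Build one per-task row, such as a stage table could display."""
    return {
        "deserialize": t.executor_deserialize_time,
        "shuffle_read_wait": t.fetch_wait_time,
        "compute": t.compute_time(),
        "shuffle_write": t.shuffle_write_time,
        "serialize_result": t.result_serialization_time,
        "total": t.total_time(),
    }
```

      Keeping the shuffle waits as sub-components of the run time means the displayed parts always sum to the total, which would make a stacked-bar style visualization straightforward.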

      Reporting all of these in the Stage UI table would be great. Bonus points if you can find some better way to visualize them.

      Note that the time spent serializing the result is not currently tracked. We should figure out whether we can track it in a simple way. One option is to modify TaskResult to contain an already-serialized buffer instead of the result itself: first serialize the result, then update the TaskMetrics with the elapsed time, and only then serialize the metrics (we wouldn't track the time to serialize the metrics themselves). If this adds too much performance overhead, we could instead write a custom serializer for the broader result struct (containing the accumulators, metrics, and result).
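
      A minimal sketch of the first option, in Python for illustration only. Here pickle stands in for whatever serializer the executor uses, the wrapper dictionary stands in for TaskResult, and the metric name resultSerializationTime and both function names are hypothetical.

```python
import pickle
import time


def serialize_task_result(value, metrics, accumulator_updates):
    """Serialize the result first so its cost can be recorded in the metrics."""
    # Step 1: serialize the result on its own and time it.
    start = time.perf_counter()
    value_bytes = pickle.dumps(value)
    elapsed_ms = (time.perf_counter() - start) * 1000.0

    # Step 2: record the measured time before the metrics are serialized.
    # "resultSerializationTime" is a hypothetical metric name.
    metrics = dict(metrics, resultSerializationTime=elapsed_ms)

    # Step 3: serialize the wrapper. The result travels as an opaque buffer,
    # and the time spent here (metrics, accumulators) is deliberately untracked.
    wrapper = {
        "value_bytes": value_bytes,
        "metrics": metrics,
        "accumulators": accumulator_updates,
    }
    return pickle.dumps(wrapper)


def deserialize_task_result(payload):
    """Driver side: unwrap first, then decode the embedded result buffer."""
    wrapper = pickle.loads(payload)
    value = pickle.loads(wrapper["value_bytes"])
    return value, wrapper["metrics"], wrapper["accumulators"]
```

      The cost of this approach is one extra buffer in the wrapper and a second deserialization step on the receiving side, which is the overhead the custom-serializer alternative would avoid.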

      One other missing piece is the ability to track additional metrics when the task is reading from or writing to HDFS, or doing some other expensive thing within its own execution. It would be nice to add support for counters and such there, but we can keep that outside the scope of this JIRA.

          People

            Assignee: Unassigned
            Reporter: Patrick Wendell (pwendell)
            Votes: 0
            Watchers: 3