Details
- Type: Improvement
- Status: In Progress
- Priority: Minor
- Resolution: Unresolved
- Affects Version/s: 3.1.0
- Fix Version/s: None
- Labels: None
Description
This proposes extending Spark's instrumentation with metrics aimed at understanding the performance of Python code called by Spark, whether via UDF, Pandas UDF, or mapPartitions. The relevant performance counters are exposed through the Spark Metrics System (based on the Dropwizard library), which makes it easy to consume the metrics produced by the executors, for example in a performance dashboard. See also the attached screenshot.
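To make the scope concrete, here is a minimal PySpark sketch (illustrative only, not part of the proposed change) of the three Python execution paths named above; the application name, column arithmetic, and function names are arbitrary placeholders, and no specific metric names are implied.

```python
# Illustrative sketch: the three Python execution paths the proposal targets.
# Names and logic are placeholders; no new metric names are assumed here.
import pandas as pd
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, pandas_udf, udf
from pyspark.sql.types import LongType

spark = SparkSession.builder.appName("python-udf-metrics-example").getOrCreate()
df = spark.range(1000)

# 1) Plain Python UDF: rows are serialized to an external Python worker.
plus_one = udf(lambda x: x + 1, LongType())
df.select(plus_one(col("id"))).count()

# 2) Pandas UDF: data is exchanged with the Python worker in Arrow batches.
@pandas_udf(LongType())
def pandas_plus_one(s: pd.Series) -> pd.Series:
    return s + 1

df.select(pandas_plus_one(col("id"))).count()

# 3) RDD mapPartitions: arbitrary Python code runs once per partition.
df.rdd.mapPartitions(lambda rows: (row.id * 2 for row in rows)).count()

spark.stop()
```

Because the proposal builds on the existing Spark Metrics System, consuming the new counters would follow the usual pattern: configure a Dropwizard sink (for example via conf/metrics.properties) and chart the executor metrics in a dashboard, as in the attached screenshot.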
Attachments
Issue Links
- links to