[SPARK-23674] Add Spark ML Listener for Tracking ML Pipeline Status - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 2.3.0
Fix Version/s: 3.0.0
Component/s: ML
Labels: None

Description

Currently, Spark provides status monitoring for several of its components, such as the Spark history server, the streaming listener, and the SQL listener. The use cases for an ML listener would be (1) a front-end UI that tracks the convergence rate across training iterations, so that data scientists can see how a job converges while training models such as K-means, logistic regression, and other linear models; and (2) tracking data lineage for the inputs and outputs of training.

In this proposal, we hope to provide a Spark ML pipeline listener to track the status of a Spark ML pipeline, including: