Spark / SPARK-23674

Add Spark ML Listener for Tracking ML Pipeline Status


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.3.0
    • Fix Version/s: 3.0.0
    • Component/s: ML
    • Labels: None

    Description

      Spark currently provides status monitoring for several of its components, such as the Spark history server, the streaming listener, and the SQL listener. An ML listener would serve two use cases: (1) a front-end UI that tracks the convergence rate across training iterations, so that data scientists can see how a job converges while training models such as K-means, logistic regression, and other linear models; and (2) tracking data lineage for the inputs and outputs of training.

      In this proposal, we hope to provide a Spark ML pipeline listener that tracks the status of a Spark ML pipeline, including:

      1. ML pipeline created and saved
      2. ML pipeline model created, saved, and loaded
      3. ML model training status monitoring
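
The three kinds of status above could be delivered as events on a listener bus. The following is a minimal, self-contained sketch of that idea in plain Python. It is not the Spark API and not the implementation attached to this issue: `MLEvent`, `MLListenerBus`, `StatusTracker`, the event kind strings, and the `loss` and `path` fields are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class MLEvent:
    """Hypothetical event record; not part of any Spark API."""
    kind: str    # e.g. "pipeline_created", "model_saved", "training_iteration"
    target: str  # identifier of the pipeline or model the event refers to
    detail: dict = field(default_factory=dict)  # free-form payload (loss, path, ...)


class MLListenerBus:
    """Minimal in-process bus that fans each event out to every listener."""

    def __init__(self) -> None:
        self._listeners: List[Callable[[MLEvent], None]] = []

    def add_listener(self, listener: Callable[[MLEvent], None]) -> None:
        self._listeners.append(listener)

    def post(self, event: MLEvent) -> None:
        for listener in self._listeners:
            listener(event)


class StatusTracker:
    """Listener that separates training progress from lifecycle events."""

    def __init__(self) -> None:
        self.losses: List[float] = []               # one entry per training iteration
        self.lifecycle: List[Tuple[str, str]] = []  # (kind, target) for create/save/load

    def __call__(self, event: MLEvent) -> None:
        if event.kind == "training_iteration":
            self.losses.append(event.detail["loss"])
        else:
            self.lifecycle.append((event.kind, event.target))


# Example: a pipeline is created, a model trains for three iterations, then is saved.
bus = MLListenerBus()
tracker = StatusTracker()
bus.add_listener(tracker)

bus.post(MLEvent("pipeline_created", "pipeline-1"))
for i, loss in enumerate([0.9, 0.5, 0.3]):
    bus.post(MLEvent("training_iteration", "kmeans-1", {"iteration": i, "loss": loss}))
bus.post(MLEvent("model_saved", "kmeans-1", {"path": "/tmp/kmeans-1"}))
```

In this sketch a UI could plot `tracker.losses` to show convergence, while the paths carried by the save and load events could feed a data-lineage record.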


          People

            Assignee: Hyukjin Kwon (gurwls223)
            Reporter: Mingjie Tang (merlin)
            Votes: 0
            Watchers: 6
