  1. Spark
  2. SPARK-15573

Backwards-compatible persistence for spark.ml

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Incomplete
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: ML
    • Labels:

      Description

      This JIRA is for enforcing backwards-compatible persistence in the DataFrame-based API for MLlib (spark.ml): each Spark version must be able to load models saved by previous versions of Spark. Loading models saved by later versions of Spark is not required.

      This requires:

      • Putting unit tests in place to check loading models from previous versions
      • Notifying all committers active on MLlib so that they are aware of this requirement going forward

      The unit tests could be written as in spark.mllib, where we essentially copied and pasted the save() code each time the save format changed. Format changes are rare, so this should be acceptable, though other designs are fine.
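
      As a rough illustration of the copy-and-paste approach, here is a minimal sketch in plain Python rather than Spark code. Everything in it is hypothetical: ToyModel, save_v1, the formatVersion field, and both on-disk layouts are invented for the example and do not reflect the actual spark.ml or spark.mllib formats. The idea is that when save() changes, the old save code is frozen as a test-only helper, and a test checks that the current load() still reads what that helper writes.

```python
import json
import os
import tempfile

CURRENT_FORMAT_VERSION = 2  # hypothetical version counter


class ToyModel:
    """Hypothetical stand-in for a persisted model; not a Spark class."""

    def __init__(self, centers):
        self.centers = centers

    def save(self, path):
        # Current (v2) layout: one record per center, tagged with its index.
        os.makedirs(path, exist_ok=True)
        with open(os.path.join(path, "metadata.json"), "w") as f:
            json.dump({"formatVersion": CURRENT_FORMAT_VERSION}, f)
        rows = [{"idx": i, "center": c} for i, c in enumerate(self.centers)]
        with open(os.path.join(path, "data.json"), "w") as f:
            json.dump(rows, f)

    @staticmethod
    def load(path):
        # Dispatch on the stored format version so older layouts stay loadable.
        with open(os.path.join(path, "metadata.json")) as f:
            version = json.load(f)["formatVersion"]
        with open(os.path.join(path, "data.json")) as f:
            data = json.load(f)
        if version == 1:
            return ToyModel(data["centers"])
        if version == 2:
            rows = sorted(data, key=lambda r: r["idx"])
            return ToyModel([r["center"] for r in rows])
        # Models from later versions are explicitly out of scope.
        raise ValueError(f"Unsupported format version: {version}")


def save_v1(model, path):
    """Frozen copy of the old (v1) save code, kept only for tests."""
    os.makedirs(path, exist_ok=True)
    with open(os.path.join(path, "metadata.json"), "w") as f:
        json.dump({"formatVersion": 1}, f)
    with open(os.path.join(path, "data.json"), "w") as f:
        json.dump({"centers": model.centers}, f)


def check_loads_old_format():
    """Write a model with the frozen v1 code, then load it with current code."""
    model = ToyModel([[0.0, 1.0], [2.0, 3.0]])
    with tempfile.TemporaryDirectory() as tmp:
        old_path = os.path.join(tmp, "v1")
        save_v1(model, old_path)
        return ToyModel.load(old_path).centers == model.centers
```

      In this sketch each format change costs one more frozen helper like save_v1 plus one test. An alternative design would check in model files written by old releases as test resources, at the cost of storing binary or generated fixtures in the repository.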

      Subtasks of this JIRA should cover checking and adding tests for existing cases, such as KMeansModel (whose format changed between 1.6 and 2.0).


              People

              • Assignee: Unassigned
              • Reporter: Joseph K. Bradley (josephkb)
              • Votes: 0
              • Watchers: 5
