Currently, there is no way to easily persist Json-serializable parameters in Python only. All parameters in Python are persisted by converting them to Java objects and using the Java persistence implementation. In order to facilitate the creation of custom Python-only pipeline stages, it would be good to have a Python-only persistence framework so that these stages do not need to be implemented in Scala for persistence.
This task involves:
- Adding implementations for DefaultParamsReadable, DefaultParamsWriteable, DefaultParamsReader, and DefaultParamsWriter in pyspark.
- relates to
SPARK-17025 Cannot persist PySpark ML Pipeline model that includes custom Transformer
- links to