Spark / SPARK-17025

Cannot persist PySpark ML Pipeline model that includes custom Transformer

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.0.0
    • Fix Version/s: 2.3.0
    • Component/s: ML, PySpark
    • Labels:
      None

      Description

      Following the example in this Databricks blog post under "Python tuning", I'm trying to save an ML Pipeline model.

      This pipeline, however, includes a custom transformer. When I try to save the model, the operation fails because the custom transformer doesn't have a _to_java attribute.

      Traceback (most recent call last):
        File ".../file.py", line 56, in <module>
          model.bestModel.save('model')
        File "/usr/local/Cellar/apache-spark/2.0.0/libexec/python/lib/pyspark.zip/pyspark/ml/pipeline.py", line 222, in save
        File "/usr/local/Cellar/apache-spark/2.0.0/libexec/python/lib/pyspark.zip/pyspark/ml/pipeline.py", line 217, in write
        File "/usr/local/Cellar/apache-spark/2.0.0/libexec/python/lib/pyspark.zip/pyspark/ml/util.py", line 93, in __init__
        File "/usr/local/Cellar/apache-spark/2.0.0/libexec/python/lib/pyspark.zip/pyspark/ml/pipeline.py", line 254, in _to_java
      AttributeError: 'PeoplePairFeaturizer' object has no attribute '_to_java'
      

      Looking at the source code for ml/base.py, I see that not even the base Transformer class has such an attribute.
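
      The failure mode in the traceback can be sketched without Spark. The classes below are hypothetical pure-Python stand-ins, not PySpark's actual implementation; they only assume what the traceback shows, namely that saving a pipeline model calls _to_java on it, which in turn expects every stage to provide _to_java.

```python
# Pure-Python stand-ins (hypothetical names); no Spark installation required.

class JavaBackedStage:
    """Stand-in for a built-in stage that wraps a JVM object."""

    def _to_java(self):
        return "java-object-for-" + type(self).__name__


class PurePythonStage:
    """Stand-in for a custom transformer written only in Python.

    Like the reporter's PeoplePairFeaturizer, it defines no _to_java method.
    """

    def transform(self, rows):
        return rows


class SketchPipelineModel:
    """Stand-in for a pipeline model whose save path converts every stage."""

    def __init__(self, stages):
        self.stages = stages

    def _to_java(self):
        # Each stage must supply _to_java; a pure-Python stage does not.
        return [stage._to_java() for stage in self.stages]

    def save(self, path):
        java_stages = self._to_java()
        return "would write %d stages to %s" % (len(java_stages), path)


def try_save(model, path):
    """Return the save result, or the AttributeError message on failure."""
    try:
        return model.save(path)
    except AttributeError as exc:
        return "AttributeError: %s" % exc
```

      In this sketch, a model built only from JavaBackedStage instances saves, while adding a single PurePythonStage makes try_save return an AttributeError message naming _to_java, which mirrors the error above.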

      I'm assuming this is missing functionality that is intended to be patched up (e.g., like this).

      I'm not sure if there is an existing JIRA for this (my searches didn't turn up clear results).

              People

              • Assignee: Ajay Saini (ajaysaini)
              • Reporter: Nicholas Chammas (nchammas)
              • Votes: 7
              • Watchers: 17
