  1. Spark
  2. SPARK-5981

pyspark ML models should support predict/transform on vector within map


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: 1.3.0
    • Fix Version/s: None
    • Component/s: MLlib, PySpark
    • Labels: None

    Description

      Currently, most Python models have only limited support for single-vector prediction.
      For example, one can call

      model.predict(myFeatureVector)

      on the driver for a single instance, but the same call fails inside a map for Python ML models and transformers that use JavaModelWrapper:

      data.map(lambda features: model.predict(features))
      

      This fails because JavaModelWrapper.call uses the SparkContext, which exists only on the driver and cannot be referenced from code running inside a transformation on the workers. (It works for linear models, which do prediction within Python.)
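The failure mode can be illustrated without Spark. The sketch below is only an analogy: `DriverOnlyContext`, `WrappedModel`, and `can_ship` are hypothetical stand-ins, not PySpark classes, and plain `pickle` stands in for the closure serialization Spark performs when it ships the lambda to workers. It shows that an object holding a driver-only handle cannot be serialized, whereas plain data can.

```python
import pickle


class DriverOnlyContext:
    """Hypothetical stand-in for a driver-side handle such as SparkContext."""

    def __reduce__(self):
        # Refuse serialization, mimicking a handle that must stay on the driver.
        raise RuntimeError("driver-only context cannot be shipped to workers")


class WrappedModel:
    """Hypothetical stand-in for a JVM-backed model wrapper."""

    def __init__(self, ctx):
        # The wrapper keeps a reference to the driver-only context.
        self.ctx = ctx


def can_ship(obj):
    """Return True if obj survives serialization, False if the context blocks it."""
    try:
        pickle.dumps(obj)
        return True
    except RuntimeError:
        return False


wrapped_ok = can_ship(WrappedModel(DriverOnlyContext()))
vector_ok = can_ship([1.0, 2.0, 3.0])
```

Here `wrapped_ok` is `False` because serializing the wrapper reaches the context it holds, while `vector_ok` is `True` because a plain feature vector carries no driver state.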

      Supporting prediction within a map would require storing the model and doing prediction/transformation within Python.
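A minimal sketch of what "storing the model and doing prediction within Python" could look like, assuming a simple linear model. `PurePythonLinearModel` is a hypothetical class, not part of PySpark; the pickle round trip stands in for shipping the model to a worker, and the built-in `map` stands in for `RDD.map`.

```python
import pickle


class PurePythonLinearModel:
    """Hypothetical model whose parameters live entirely in Python."""

    def __init__(self, weights, intercept):
        self.weights = list(weights)
        self.intercept = float(intercept)

    def predict(self, features):
        # Dot product plus intercept, computed without any JVM or context call.
        return sum(w * x for w, x in zip(self.weights, features)) + self.intercept


model = PurePythonLinearModel([0.5, -1.0], 2.0)

# Round trip through pickle to mimic sending the model to a worker process.
shipped = pickle.loads(pickle.dumps(model))

data = [[4.0, 1.0], [0.0, 0.0]]
predictions = list(map(shipped.predict, data))
```

Because the model holds only Python data, it serializes cleanly and `predict` can run wherever the closure is executed. The trade-off is that each model type would need its prediction logic reimplemented in Python.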

            People

              Assignee: Unassigned
              Reporter: Joseph K. Bradley (josephkb)
              Votes: 0
              Watchers: 4
