PredictionIO (Retired) / PIO-192

Enhance PySpark support

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: 0.13.0
    • Fix Version/s: 0.14.0
    • Component/s: Core
    • Labels: None

    Description

      Summary

      Enhance pypio, the Python API for PredictionIO (PIO).

      Goals

      The current Python support always requires developers to have access to sbt. This enhancement removes the build phase.

      Description

      A Python engine has no additional requirements. Developers can use the pypio module from a Jupyter notebook or from plain Python code.

      First, import the necessary modules.

      import pypio
      

      Once the module is imported, the first step is to initialize it.

      pypio.init()
      

      Next, fetch events from the event store for the app (here, BHPApp).

      event_df = pypio.find_events('BHPApp')
      

      And then, save the model.

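      from pyspark.ml import Pipeline

      # pipeline.fit() returns a PipelineModel; that fitted model is what gets saved
      pipeline = Pipeline(...)
      model = pipeline.fit(train_df)
      # Keep the returned ID: `pio deploy --engine-instance-id` takes it as its argument
      engine_instance_id = pypio.save_model(model, ["prediction"])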
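
The four steps above (init, find events, fit, save) can be collected into one function. This is only a sketch: `train_and_save` and its parameter names are hypothetical and not part of pypio. The pypio module is passed in as an argument so the call order can be checked with a stand-in object, without a running Spark session.

```python
# Hypothetical helper; none of these names are defined by pypio itself.
def train_and_save(pio, app_name, train, predict_columns):
    """Run the init -> find_events -> train -> save_model sequence once.

    pio             -- the imported pypio module, or a stand-in exposing
                       init(), find_events() and save_model()
    app_name        -- name of the PIO app whose events are read
    train           -- callable that takes the event DataFrame and returns
                       a fitted model (for example a PipelineModel)
    predict_columns -- forwarded unchanged as the second argument of
                       save_model(), e.g. ["prediction"]
    """
    pio.init()                              # step 1: initialize pypio
    event_df = pio.find_events(app_name)    # step 2: read events for the app
    model = train(event_df)                 # step 3: caller-supplied training
    return pio.save_model(model, predict_columns)  # step 4: returns the instance ID
```

In a notebook or training script this would be called with the real module, for instance `train_and_save(pypio, 'BHPApp', my_train_fn, ["prediction"])`, where `my_train_fn` is whatever function builds and fits the pipeline.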
      

      Run & Deploy

      Run Jupyter
      pio-shell --with-pyspark
      
      Run on Spark
      pio train --main-py-file xxxx.py
      
      Deploy App
      pio deploy --engine-instance-id <engine_instance_id>
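
When deployment is scripted, the ID returned by `save_model` has to be carried into the `pio deploy` call. A minimal sketch of that hand-off follows; `deploy_command` is a hypothetical helper, and the only flag it uses is the `--engine-instance-id` flag shown above.

```python
# Hypothetical helper: builds the argument list for the deploy step.
def deploy_command(engine_instance_id):
    """Return the `pio deploy` invocation for one engine instance as an argv list."""
    if not engine_instance_id:
        # An empty ID would make the flag swallow the next argument or fail obscurely.
        raise ValueError("engine_instance_id must be a non-empty string")
    return ["pio", "deploy", "--engine-instance-id", str(engine_instance_id)]
```

The list form can be handed directly to `subprocess.run(deploy_command(engine_instance_id), check=True)`, which avoids shell quoting issues.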
      

            People

              Assignee: Takako Shimamoto (shimamoto)
              Reporter: Takako Shimamoto (shimamoto)