[SPARK-44076] SPIP: Python Data Source API

Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 4.0.0
    • Fix Version/s: None
    • Component/s: PySpark
    • Labels: None

    Description

      This proposal introduces a simple Python API for data sources. It would let Python developers create data sources without learning Scala or dealing with the complexity of the current data source APIs, making Spark more accessible to the wider Python developer community. The proposed approach builds on the recently introduced Python user-defined table functions (SPARK-43797), extended to support data sources.
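
      To make the idea concrete, here is a minimal plain-Python sketch of what such an API could look like from a developer's point of view. It is not taken from the SPIP document: the class names (`DataSource`, `DataSourceReader`), the method names (`name`, `schema`, `reader`, `read`), the DDL-style schema string, and the `n` option are all hypothetical and chosen only for illustration. The sketch does not import Spark; it only shows the shape of a subclass-and-yield-rows design.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterator, Tuple


class DataSourceReader(ABC):
    """Hypothetical reader interface: produces rows as plain tuples."""

    @abstractmethod
    def read(self) -> Iterator[Tuple]:
        ...


class DataSource(ABC):
    """Hypothetical base class that a Python developer would subclass."""

    def __init__(self, options: Dict[str, str]):
        # Options are assumed to arrive as string key/value pairs.
        self.options = options

    @classmethod
    @abstractmethod
    def name(cls) -> str:
        """Short name under which the source would be registered."""

    @abstractmethod
    def schema(self) -> str:
        """Schema of the produced rows, here as a DDL-style string."""

    @abstractmethod
    def reader(self) -> DataSourceReader:
        """Return the object that actually produces the rows."""


class RangeReader(DataSourceReader):
    def __init__(self, n: int):
        self.n = n

    def read(self) -> Iterator[Tuple]:
        # Yield (id, square) pairs for 0 <= id < n.
        for i in range(self.n):
            yield (i, i * i)


class SquaresDataSource(DataSource):
    """Example source written entirely in Python, with no Scala involved."""

    @classmethod
    def name(cls) -> str:
        return "squares"

    def schema(self) -> str:
        return "id INT, square INT"

    def reader(self) -> DataSourceReader:
        # "n" is a hypothetical option; default to 3 rows when absent.
        return RangeReader(int(self.options.get("n", "3")))


source = SquaresDataSource({"n": "4"})
rows = list(source.reader().read())
```

      In a design along these lines, the engine would be responsible for instantiating the source, calling `schema()`, and turning the yielded tuples into rows, much as it already does for the output of a Python user-defined table function.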

      SPIP: https://docs.google.com/document/d/1oYrCKEKHzznljYfJO4kx5K_Npcgt1Slyfph3NEk7JRU/edit?usp=sharing

          People

            Assignee: Unassigned
            Reporter: Allison Wang (allisonwang-db)
            Shepherd: Hyukjin Kwon