  1. Spark
  2. SPARK-44076 SPIP: Python Data Source API
  3. SPARK-45713

Support registering Python data sources

Details

    Description

      Support registering Python data sources.

      Users can register a Python data source and later reference it by its name:

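      from pyspark.sql.datasource import DataSource

      class MyDataSource(DataSource):
          @classmethod
          def name(cls):
              # Name used to reference this data source after registration.
              return "my-data-source"

      # Assumes an active SparkSession bound to `spark`.
      spark.dataSource.register(MyDataSource)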

      Users can then pass the name of the data source as the format (support for this is tracked in SPARK-45639):

      spark.read.format("my-data-source").load()
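
The register-then-reference-by-name flow described above can be illustrated without Spark. The following is a minimal pure-Python sketch, not Spark's implementation: `DataSourceRegistry`, its `register` and `lookup` methods, and the stand-in `DataSource` base class are all hypothetical names introduced for illustration, and the fallback to the class name when `name()` is not overridden is an assumption of this sketch.

```python
class DataSource:
    """Stand-in base class for this sketch (not pyspark's DataSource)."""

    @classmethod
    def name(cls):
        # Assumption of this sketch: fall back to the class name
        # when a subclass does not override name().
        return cls.__name__


class DataSourceRegistry:
    """Hypothetical in-memory registry keyed by data source name."""

    def __init__(self):
        self._sources = {}

    def register(self, data_source):
        # Accept only classes that derive from DataSource.
        if not (isinstance(data_source, type) and issubclass(data_source, DataSource)):
            raise TypeError("expected a DataSource subclass")
        # A later registration under the same name replaces the earlier one.
        self._sources[data_source.name()] = data_source

    def lookup(self, name):
        # Resolve a previously registered class by its name.
        try:
            return self._sources[name]
        except KeyError:
            raise ValueError(f"no data source registered under {name!r}") from None


class MyDataSource(DataSource):
    @classmethod
    def name(cls):
        return "my-data-source"


registry = DataSourceRegistry()
registry.register(MyDataSource)
resolved = registry.lookup("my-data-source")
```

Keying the registry on `name()` rather than on the class object is what lets a caller refer to the data source with a plain string, as in the `format("my-data-source")` call above.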

          People

            allisonwang-db Allison Wang