Spark / SPARK-32439

Override datasource implementation during look up via configuration


Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Minor
    • Resolution: Won't Fix
    • Affects Version/s: 3.1.0
    • Fix Version/s: None
    • Component/s: SQL
    • Labels: None

    Description

      We need a mechanism to override the datasource implementation via configuration.

      For example, suppose I have a custom CSV datasource implementation called "my_csv". One way to use it is:

       val df = spark.read.format("my_csv").load(...)
      

      Since the underlying data is still in the same format (CSV), the user should be able to keep the standard CSV read call and simply override which implementation backs it.

      One proposal is to do the following:

       spark.conf.set("spark.sql.datasource.override.csv", "my_csv")
       val df = spark.read.csv(...)
      

      This has the benefit that the user does not have to change any application code to try out a new datasource implementation for the same source format.
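      The proposed lookup step can be sketched as a small pure function. Note this is only an illustration of the idea: the conf key prefix `spark.sql.datasource.override.` comes from this proposal (the issue was resolved Won't Fix, so no such setting exists in Spark), and `resolveFormat` is a hypothetical helper, not Spark API.

```scala
// Sketch, under the assumptions above: before resolving a datasource by
// its short name, consult the override namespace and substitute the
// configured implementation if one is present.
object DataSourceOverride {
  // conf models the session configuration (spark.conf) as a plain map.
  def resolveFormat(conf: Map[String, String], format: String): String =
    conf.getOrElse(s"spark.sql.datasource.override.$format", format)
}
```

      With `spark.sql.datasource.override.csv -> my_csv` set, `resolveFormat(conf, "csv")` would return `"my_csv"`; with no override configured, the built-in `"csv"` lookup is unchanged.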

      Attachments

        Activity

          People

            Assignee: Unassigned
            Reporter: Terry Kim (imback82)
            Votes: 0
            Watchers: 2

            Dates

              Created:
              Updated:
              Resolved: