Description
Currently, our JDBC connector provides the option `dbtable` for users to specify the to-be-loaded JDBC source table.
val jdbcDf = spark.read
  .format("jdbc")
  .option("dbtable", "dbName.tableName")
  .options(jdbcCredentials: Map)
  .load()
Normally, users do not fetch the whole JDBC table, due to the poor performance/throughput of JDBC; they usually fetch only a small subset of the table. Advanced users can achieve this today by passing a subquery as the option:
val query = """ (select * from tableName limit 10) as tmp """ val jdbcDf = spark.read .format("jdbc") .option("*dbtable*", query) .options(jdbcCredentials: Map) .load()
However, this is not straightforward for end users. We should simply allow users to specify the query via a new option `query`, and handle the complexity for them:
val query = """select * from tableName limit 10""" val jdbcDf = spark.read .format("jdbc") .option("*{color:#ff0000}query{color}*", query) .options(jdbcCredentials: Map) .load()
Users are not allowed to specify `query` and `dbtable` at the same time.
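Internally, the new option can be implemented by wrapping the user query in a table subquery with a generated alias, which is exactly the trick advanced users apply manually today, plus a check that the two options are mutually exclusive. A minimal Scala sketch of this option handling follows; all names here (the object, method, and alias) are illustrative, not Spark's actual internals:

// Illustrative sketch: normalize the `query` option into the table
// expression the JDBC reader already understands.
// All names below are hypothetical, not Spark's real implementation.
object JdbcOptionHandling {
  def resolveTableExpression(options: Map[String, String]): String = {
    val dbtable = options.get("dbtable")
    val query   = options.get("query")
    (dbtable, query) match {
      case (Some(_), Some(_)) =>
        // The two options are mutually exclusive.
        throw new IllegalArgumentException(
          "Options `dbtable` and `query` cannot both be specified.")
      case (None, None) =>
        throw new IllegalArgumentException(
          "One of the options `dbtable` or `query` is required.")
      case (Some(table), None) =>
        // Plain table name (or user-written subquery) passes through.
        table
      case (None, Some(q)) =>
        // Wrap the query as a table subquery with a generated alias,
        // the same trick users apply manually with `dbtable` today.
        s"(${q.trim}) spark_gen_alias"
    }
  }
}

For example, resolveTableExpression(Map("query" -> "select * from tableName limit 10")) would produce "(select * from tableName limit 10) spark_gen_alias", which the existing `dbtable` code path can consume unchanged.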
Issue Links
- is duplicated by SPARK-8324 Register Query as view through JDBC interface (Resolved)