[SPARK-24423] Add a new option `query` for JDBC sources - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 2.3.0
Fix Version/s: 2.4.0
Component/s: SQL
Labels: None
Target Version/s: 2.4.0

Description

Currently, our JDBC connector provides the option `dbtable` for users to specify the to-be-loaded JDBC source table.

 
 val jdbcDf = spark.read
   .format("jdbc")
   .option("dbtable", "dbName.tableName")
   .options(jdbcCredentials) // jdbcCredentials: Map[String, String]
   .load()

Normally, users do not fetch the whole JDBC table because of the poor performance/throughput of JDBC. Instead, they typically fetch only a small subset of the data. Advanced users can pass a subquery as the `dbtable` option.

 
 val query = """ (select * from tableName limit 10) as tmp """
 val jdbcDf = spark.read
   .format("jdbc")
   .option("dbtable", query)
   .options(jdbcCredentials) // jdbcCredentials: Map[String, String]
   .load()

However, this is not straightforward for end users: they have to know to wrap the query in parentheses and give it an alias. We should simply allow users to specify the query through a new option `query`, and handle that complexity for them.

 
 val query = """select * from tableName limit 10"""
 val jdbcDf = spark.read
   .format("jdbc")
   .option("query", query)
   .options(jdbcCredentials) // jdbcCredentials: Map[String, String]
   .load()

Users are not allowed to specify `query` and `dbtable` at the same time.
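To illustrate what "handling the complexity" could look like, here is a minimal sketch of resolving the two options into a single table expression. It is not Spark's actual implementation: the object `JdbcSourceOptions`, the method `resolveTableExpression`, the alias `tmp_subquery`, and the error messages are all hypothetical names chosen for this example.

```scala
// Hypothetical helper, written only to illustrate the proposal.
// It is NOT the code that was merged into Spark.
object JdbcSourceOptions {

  // Resolves user-supplied options to the expression that would be placed
  // in the FROM clause of the SQL sent over JDBC.
  def resolveTableExpression(options: Map[String, String]): String = {
    // Treat missing and blank values the same way.
    val dbtable = options.get("dbtable").map(_.trim).filter(_.nonEmpty)
    val query = options.get("query").map(_.trim).filter(_.nonEmpty)

    (dbtable, query) match {
      case (Some(_), Some(_)) =>
        // The two options are mutually exclusive.
        throw new IllegalArgumentException(
          "Options 'dbtable' and 'query' cannot be specified at the same time.")
      case (None, None) =>
        throw new IllegalArgumentException(
          "Either option 'dbtable' or option 'query' is required.")
      case (Some(table), None) =>
        // A table name (or a hand-written aliased subquery) is used as is.
        table
      case (None, Some(q)) =>
        // Wrap the query and alias it, as users previously did by hand.
        // The alias name is an assumption made for this sketch.
        s"($q) AS tmp_subquery"
    }
  }
}
```

With this sketch, `Map("query" -> "select * from tableName limit 10")` resolves to the same kind of aliased subquery that users previously had to write by hand for `dbtable`, and supplying both options fails fast with an `IllegalArgumentException`.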

Issue Links

is duplicated by

SPARK-8324 Register Query as view through JDBC interface

Resolved

links to

[Github] Pull Request #21590 (dilipbiswal)

[Github] Pull Request #23170 (wangyum)

People

Assignee: Dilip Biswal

Reporter: Xiao Li

Votes: 0

Watchers: 5

Dates

Created: 30/May/18 06:17

Updated: 04/Jan/19 14:43

Resolved: 26/Jun/18 22:17