Details
Type: New Feature
Status: Resolved
Priority: Major
Resolution: Fixed
None
Description
As a Java developer, I would like the ability to use JDBC to interact with Flight servers. For example, the Arrow repo now contains an example Flight server that wraps DataFusion and supports executing SQL against CSV and Parquet files. I would like to be able to call this server from Java.
An Arrow Flight JDBC driver would also simplify developing integrations with other Apache projects, such as building a Spark V2 Data Source or a Drill storage plugin. It would also be directly usable from many BI tools.
I propose that the class name of the driver should be "org.apache.arrow.jdbc.Driver" and the connection string should be "jdbc:arrow://host:port?[properties]". I'm purposely leaving "flight" out of these because I don't think it makes sense to support multiple protocols now that we have Flight, and "arrow" is easier for users to remember than a protocol name. This is easy to change if there are objections.
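To make the proposed connection string concrete, here is a minimal sketch of how a driver might split "jdbc:arrow://host:port?[properties]" into its parts. The class name `ArrowJdbcUrl` is hypothetical, and the property syntax (`key=value` pairs separated by `&`) is an assumption, since the proposal does not specify how properties are encoded. It uses only the JDK and is not taken from any existing driver.

```java
import java.net.URI;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical helper: parses the proposed "jdbc:arrow://host:port?[properties]" form.
final class ArrowJdbcUrl {
    static final String PREFIX = "jdbc:arrow://";

    final String host;
    final int port;
    final Properties properties;

    private ArrowJdbcUrl(String host, int port, Properties properties) {
        this.host = host;
        this.port = port;
        this.properties = properties;
    }

    // Mirrors the check a java.sql.Driver#acceptsURL implementation would perform.
    static boolean accepts(String url) {
        return url != null && url.startsWith(PREFIX);
    }

    static ArrowJdbcUrl parse(String url) {
        if (!accepts(url)) {
            throw new IllegalArgumentException("Not an Arrow JDBC URL: " + url);
        }
        // Drop the "jdbc:" prefix so java.net.URI sees "arrow://host:port?...".
        URI uri = URI.create(url.substring("jdbc:".length()));
        if (uri.getHost() == null || uri.getPort() < 0) {
            throw new IllegalArgumentException("Expected jdbc:arrow://host:port, got: " + url);
        }
        Properties props = new Properties();
        String query = uri.getRawQuery();
        if (query != null && !query.isEmpty()) {
            // Assumption: properties are '&'-separated key=value pairs.
            for (String pair : query.split("&")) {
                if (pair.isEmpty()) {
                    continue;
                }
                int eq = pair.indexOf('=');
                String key = eq < 0 ? pair : pair.substring(0, eq);
                String value = eq < 0 ? "" : pair.substring(eq + 1);
                props.setProperty(decode(key), decode(value));
            }
        }
        return new ArrowJdbcUrl(uri.getHost(), uri.getPort(), props);
    }

    private static String decode(String s) {
        return URLDecoder.decode(s, StandardCharsets.UTF_8);
    }
}
```

A caller would use it as `ArrowJdbcUrl.parse("jdbc:arrow://localhost:50051?user=alice")` and then read the `host`, `port`, and `properties` fields; the host, port, and property names here are placeholders.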
JDBC is designed around sending queries as strings and then receiving results. These strings could be SQL queries, JSON-encoded query plans, or something else. The JDBC driver will not make any assumptions about the format or dialect of these strings. Queries would be executed using the "DoGet" method.
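The "no assumptions about the format" point can be sketched in a few lines. A Flight "DoGet" call takes a ticket whose payload is opaque bytes, so one way to honor this design is to pass the query string through unchanged. The class name `QueryTicket` and the choice of UTF-8 encoding are assumptions made for illustration; the proposal does not say how the string would be placed in the ticket.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: carries a query string as the opaque payload of a DoGet ticket.
final class QueryTicket {
    private QueryTicket() {}

    // The driver does not parse or validate the query; it only encodes it.
    // Assumption: the payload is the UTF-8 bytes of the query string.
    static byte[] encode(String query) {
        return query.getBytes(StandardCharsets.UTF_8);
    }

    // What a server on the other side might do to recover the original string.
    static String decode(byte[] ticketBytes) {
        return new String(ticketBytes, StandardCharsets.UTF_8);
    }
}
```

Because the driver only encodes and forwards, a SQL statement and a JSON-encoded query plan take exactly the same path; interpreting the string is left entirely to the Flight server.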
The JDBC metadata functionality for reading schema information could possibly use ListFlights but I haven't looked into this part yet.
I expect that this JDBC driver will serve as a base that can be extended with functionality specific to different Flight servers, rather than attempting to support them all itself.
Issue Links
- is related to: ARROW-15111 [C++] Implement ODBC driver "wrapper" using FlightSQL (Open)
- relates to: ARROW-9825 [FlightRPC] Add a "Flight SQL" extension on top of FlightRPC (Resolved)
- supersedes: ARROW-15452 [FlightRPC][Java] JDBC driver for Flight SQL (Closed)
- links to