Description
The specific observations below are based on Postgres 9.4 tables accessed via the postgresql-9.4-1201.jdbc41.jar driver. However, based on the behavior, I would expect the problem to exist for all external SQL databases.
- json and jsonb columns generate java.sql.SQLException: Unsupported type 1111. While it is reasonable to not support dynamic schema discovery of JSON columns automatically (it requires two passes over the data), a better behavior would be to create a String column and return the JSON.
- Array columns generate java.sql.SQLException: Unsupported type 2003. This is true even for simple types, e.g., text[]. A better behavior would be to create an Array column.
- Custom type columns are mapped to a String column. This behavior is harder to understand as the schema of a custom type is fixed and therefore mappable to a Struct column. The automatic conversion to a string is also inconsistent when compared to json and array column handling.
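Until these types are handled, one possible workaround is to cast the offending columns to text on the Postgres side and pass the resulting subquery as the JDBC table. The sketch below only builds that subquery string. The class name, the table and column names, and the alias are hypothetical, and it assumes the data source accepts a parenthesized subquery wherever a table name is expected. It does no identifier quoting, so it is only suitable for trusted, simple names.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of Spark or the Postgres driver.
class CastToTextSubquery {

    // columns maps each column name to true if it should be cast to text
    // (e.g. json, jsonb, or array columns) and false if selected as-is.
    static String build(String table, Map<String, Boolean> columns, String alias) {
        StringJoiner cols = new StringJoiner(", ");
        for (Map.Entry<String, Boolean> e : columns.entrySet()) {
            String name = e.getKey();
            // "col::text" is the Postgres cast syntax; keep the original name as the alias.
            cols.add(e.getValue() ? name + "::text AS " + name : name);
        }
        return "(SELECT " + cols + " FROM " + table + ") AS " + alias;
    }

    public static void main(String[] args) {
        // Assumed example schema: events(id integer, payload jsonb, tags text[]).
        Map<String, Boolean> columns = new LinkedHashMap<>();
        columns.put("id", false);
        columns.put("payload", true);
        columns.put("tags", true);
        System.out.println(build("events", columns, "t"));
    }
}
```

The resulting string would be supplied in place of the table name when loading the table, so the driver reports the cast columns as ordinary text rather than as type 1111 or 2003.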
The exceptions are thrown by org.apache.spark.sql.jdbc.JDBCRDD$.org$apache$spark$sql$jdbc$JDBCRDD$$getCatalystType(JDBCRDD.scala:100) so this definitely looks like a Spark SQL and not a JDBC problem.
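For reference, the numeric codes in the exceptions are the standard java.sql.Types constants: 1111 is Types.OTHER and 2003 is Types.ARRAY. The sketch below shows the kind of fallback proposed above, as I would imagine it. It is not Spark's actual getCatalystType code, and the class name and the returned labels are made up for illustration.

```java
import java.sql.Types;

// Hypothetical sketch of the proposed fallback; not Spark's implementation.
class JdbcTypeFallbackSketch {

    // Returns an illustrative label for the column kind that the proposal
    // above would produce for a given JDBC type code.
    static String proposedColumnKind(int sqlType) {
        switch (sqlType) {
            case Types.OTHER:
                // Vendor-specific types such as json/jsonb: return the raw value as a string.
                return "String";
            case Types.ARRAY:
                // SQL arrays such as text[]: map to an array column.
                return "Array";
            default:
                // Everything else is out of scope for this sketch.
                return "existing mapping";
        }
    }

    public static void main(String[] args) {
        System.out.println(Types.OTHER); // prints 1111
        System.out.println(Types.ARRAY); // prints 2003
        System.out.println(proposedColumnKind(1111));
        System.out.println(proposedColumnKind(2003));
    }
}
```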
Issue Links
- is duplicated by
  - SPARK-12266 cannot handle postgis raster type (Resolved)
  - SPARK-5753 add basic support to JDBCRDD for postgresql types: uuid, hstore, and array (Resolved)
- is related to
  - SPARK-11394 PostgreDialect cannot handle BYTE types (Resolved)
  - SPARK-7869 Spark Data Frame Fails to Load Postgres Tables with JSONB DataType Columns (Resolved)