Details
- Type: Bug
- Status: Resolved
- Priority: Minor
- Resolution: Duplicate
- Affects Version/s: 2.0.2, 2.1.3, 2.2.2, 2.3.2, 2.4.0
- Fix Version/s: None
- Component/s: None
Description
This bug was reported on the spark-user mailing list: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-jdbc-postgres-numeric-array-td34280.html
To reproduce:
// Creates a table in a PostgreSQL shell
postgres=# CREATE TABLE t (v numeric[], d numeric);
CREATE TABLE
postgres=# INSERT INTO t VALUES('{1111.222,2222.332}', 222.4555);
INSERT 0 1
postgres=# SELECT * FROM t;
          v          |    d
---------------------+----------
 {1111.222,2222.332} | 222.4555
(1 row)

postgres=# \d t
      Table "public.t"
 Column |   Type    | Modifiers
--------+-----------+-----------
 v      | numeric[] |
 d      | numeric   |

// Then, reads it in Spark
./bin/spark-shell --jars=postgresql-42.2.4.jar -v

scala> import java.util.Properties
scala> val options = new Properties();
scala> options.setProperty("driver", "org.postgresql.Driver")
scala> options.setProperty("user", "maropu")
scala> options.setProperty("password", "")
scala> val pgTable = spark.read.jdbc("jdbc:postgresql:postgres", "t", options)
scala> pgTable.printSchema
root
 |-- v: array (nullable = true)
 |    |-- element: decimal(0,0) (containsNull = true)
 |-- d: decimal(38,18) (nullable = true)

scala> pgTable.show
19/01/05 09:16:34 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
java.lang.IllegalArgumentException: requirement failed: Decimal precision 4 exceeds max precision 0
    at scala.Predef$.require(Predef.scala:281)
    at org.apache.spark.sql.types.Decimal.set(Decimal.scala:116)
    at org.apache.spark.sql.types.Decimal$.apply(Decimal.scala:465)
    ...
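Until the dialect handles this, one possible workaround is to push a cast into the JDBC query so that the driver reports a concrete element type. The following is an untested sketch that reuses the spark session and options from the shell session above; note that casting to float8[] trades the exactness of numeric for double precision:

scala> // Untested workaround sketch: cast the array column to float8[] inside a
scala> // subquery; Spark then reads v as array<double> instead of decimal(0,0).
scala> val query = "(SELECT v::float8[] AS v, d FROM t) AS t_cast"
scala> val casted = spark.read.jdbc("jdbc:postgresql:postgres", query, options)
scala> casted.printSchema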
I looked over the related code, and I think we need additional logic in PostgresDialect to handle numeric arrays:
https://github.com/apache/spark/blob/2a30deb85ae4e42c5cbc936383dd5c3970f4a74f/sql/core/src/main/scala/org/apache/spark/sql/jdbc/PostgresDialect.scala#L41
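For a numeric column declared without an explicit precision, PostgreSQL reports precision 0 through JDBC, which is why the array element type above becomes the unusable decimal(0,0) while the scalar column d already falls back to decimal(38,18). A minimal sketch of applying the same fallback to array elements (a hypothetical helper, not the actual patch) could look like:

import org.apache.spark.sql.types.{DataType, DecimalType}

// Hypothetical helper: choose a Catalyst decimal type for a PostgreSQL numeric.
// When the driver reports no precision (precision == 0, i.e. an unbounded
// numeric), fall back to Spark's system default decimal(38,18) instead of
// producing the invalid DecimalType(0, 0).
def toCatalystDecimal(precision: Int, scale: Int): DataType = {
  if (precision > 0) DecimalType(precision, scale)
  else DecimalType.SYSTEM_DEFAULT // decimal(38,18), matching the scalar column d above
}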
Attachments
Issue Links
- duplicates: SPARK-26538 Postgres numeric array support (Resolved)
- links to