  1. Spark
  2. SPARK-26540

Support PostgreSQL numeric arrays without precision/scale

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Duplicate
    • Affects Version/s: 2.0.2, 2.1.3, 2.2.2, 2.3.2, 2.4.0
    • Fix Version/s: None
    • Component/s: SQL
    • Labels: None

    Description

      This bug was reported in spark-user: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-jdbc-postgres-numeric-array-td34280.html

      To reproduce:

      // Creates a table in a PostgreSQL shell
      postgres=# CREATE TABLE t (v numeric[], d  numeric);
      CREATE TABLE
      postgres=# INSERT INTO t VALUES('{1111.222,2222.332}', 222.4555);
      INSERT 0 1
      postgres=# SELECT * FROM t;
                v          |    d     
      ---------------------+----------
       {1111.222,2222.332} | 222.4555
      (1 row)
      
      postgres=# \d t
              Table "public.t"
       Column |   Type    | Modifiers 
      --------+-----------+-----------
       v      | numeric[] | 
       d      | numeric   | 
      
      // Then, reads it in Spark
      ./bin/spark-shell --jars=postgresql-42.2.4.jar -v
      
      scala> import java.util.Properties
      scala> val options = new Properties();
      scala> options.setProperty("driver", "org.postgresql.Driver")
      scala> options.setProperty("user", "maropu")
      scala> options.setProperty("password", "")
      scala> val pgTable = spark.read.jdbc("jdbc:postgresql:postgres", "t", options)
      scala> pgTable.printSchema
      root
       |-- v: array (nullable = true)
       |    |-- element: decimal(0,0) (containsNull = true)
       |-- d: decimal(38,18) (nullable = true)
      
      scala> pgTable.show
      9/01/05 09:16:34 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
      java.lang.IllegalArgumentException: requirement failed: Decimal precision 4 exceeds max precision 0
      	at scala.Predef$.require(Predef.scala:281)
      	at org.apache.spark.sql.types.Decimal.set(Decimal.scala:116)
      	at org.apache.spark.sql.types.Decimal$.apply(Decimal.scala:465)
      ...
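
The message reports precision 4 even though 1111.222 has seven digits. One reading, sketched below without any Spark dependency, is that the value is first rounded to the target scale (0 here) and only then checked against the target precision (also 0). `DecimalCheckSketch`, `fits`, and `digitsAfterRounding` are hypothetical names for illustration, and the rounding mode is an assumption; this is not Spark's actual code.

```scala
import scala.math.BigDecimal.RoundingMode

object DecimalCheckSketch {
  // Hypothetical stand-in for the failing requirement:
  // round to the target scale, then compare the digit count with the target precision.
  def fits(value: BigDecimal, precision: Int, scale: Int): Boolean = {
    val rounded = value.setScale(scale, RoundingMode.HALF_UP) // rounding mode is an assumption
    rounded.precision <= precision
  }

  // Number of significant digits left after rounding to the given scale.
  def digitsAfterRounding(value: BigDecimal, scale: Int): Int =
    value.setScale(scale, RoundingMode.HALF_UP).precision
}
```

Under this model, 1111.222 rounded to scale 0 leaves the four digits of 1111, which cannot fit a precision of 0, while the same value fits decimal(38,18).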
      

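      The schema above points at the cause: the scalar column d is mapped to decimal(38,18), but the elements of v are mapped to decimal(0,0), so values such as 1111.222 fail the precision check. I looked over the related code, and I think we need more logic to handle numeric arrays declared without precision/scale: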
      https://github.com/apache/spark/blob/2a30deb85ae4e42c5cbc936383dd5c3970f4a74f/sql/core/src/main/scala/org/apache/spark/sql/jdbc/PostgresDialect.scala#L41
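
A minimal sketch of the kind of extra logic this could mean, assuming the driver reports precision 0 for an unconstrained numeric array element and that the right fallback is the decimal(38,18) already used for the scalar column d. `DecimalSpec` and `NumericArrayFallbackSketch` are hypothetical, Spark-free stand-ins, not the dialect's real types or the fix that was eventually applied.

```scala
// Hypothetical model of a decimal type; this is not Spark's DecimalType.
final case class DecimalSpec(precision: Int, scale: Int)

object NumericArrayFallbackSketch {
  // Assumption: reuse the type that printSchema shows for the scalar column `d`.
  val Fallback: DecimalSpec = DecimalSpec(38, 18)

  // Assumption: a reported precision of 0 means the column was declared
  // without precision/scale, so substitute the fallback instead of decimal(0,0).
  def elementSpec(reportedPrecision: Int, reportedScale: Int): DecimalSpec =
    if (reportedPrecision <= 0) Fallback
    else DecimalSpec(reportedPrecision, reportedScale)
}
```

With this rule, `elementSpec(0, 0)` yields the fallback, while an explicitly declared element type such as numeric(10,2) is passed through unchanged.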

            People

              Assignee: Unassigned
              Reporter: Takeshi Yamamuro (maropu)
              Votes: 0
              Watchers: 2