Spark / SPARK-43267

Support creating a DataFrame from a Postgres table that contains a user-defined array column

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 2.4.0, 3.3.2
    • Fix Version/s: 3.5.0
    • Component/s: SQL
    • Labels: None

    Description

      Spark SQL currently does not support creating a DataFrame from a Postgres table that contains a user-defined array column. It did allow such columns before the Postgres JDBC commit (https://github.com/pgjdbc/pgjdbc/commit/375cb3795c3330f9434cee9353f0791b86125914); the previous behavior was to handle a user-defined array column as a String.

      Given:

      Results:

      • Exception “java.sql.SQLException: Unsupported type ARRAY” is thrown

      Expectation after the change:

      • The function call succeeds
      • The user-defined array column is read as a string in the Spark DataFrame

      Suggested fix:

      • Update the “getCatalystType” function in “PostgresDialect” so that an array whose element type cannot be mapped falls back to StringType:
        • val catalystType = toCatalystType(typeName.drop(1), size, scale).map(ArrayType(_))
          if (catalystType.isEmpty) Some(StringType) else catalystType
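
      The fallback in the suggested fix can be sketched in isolation. This is a minimal, self-contained illustration, not the actual Spark patch: the types below are stand-ins for Spark's classes in org.apache.spark.sql.types, the one-argument toCatalystType is a heavily reduced, hypothetical version of the real helper (which also takes size and scale), and arrayCatalystType and PostgresArrayFallbackSketch are names invented for this example.

```scala
// Stand-in types for illustration only; Spark's real ones live in
// org.apache.spark.sql.types.
sealed trait DataType
case object StringType extends DataType
case object IntegerType extends DataType
case object DoubleType extends DataType
final case class ArrayType(elementType: DataType) extends DataType

object PostgresArrayFallbackSketch {
  // Hypothetical, reduced element-type lookup: returns None for any type
  // name it does not recognise, such as a user-defined type.
  def toCatalystType(typeName: String): Option[DataType] = typeName match {
    case "int4"             => Some(IntegerType)
    case "float8"           => Some(DoubleType)
    case "text" | "varchar" => Some(StringType)
    case _                  => None
  }

  // Postgres array type names carry a leading underscore (e.g. "_int4"),
  // so drop it before looking up the element type. If the element type is
  // unknown, fall back to StringType instead of returning None.
  def arrayCatalystType(typeName: String): Option[DataType] =
    toCatalystType(typeName.drop(1))
      .map(ArrayType(_))
      .orElse(Some(StringType))
}
```

      With these stand-ins, arrayCatalystType("_int4") yields Some(ArrayType(IntegerType)), while an unrecognised name such as "_my_enum" yields Some(StringType). Using orElse is equivalent to the isEmpty check in the suggested fix.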

          People

            Assignee: Jia Fan (fanjia)
            Reporter: Sifan Huang (sifhuang)