Spark / SPARK-50092

Fix PostgreSQL connector behaviour for multidimensional arrays

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 4.0.0
    • Fix Version/s: 4.0.0
    • Component/s: SQL

    Description

      PR https://github.com/apache/spark/pull/46006 fixed the PostgreSQL connector's behaviour for multidimensional arrays; before it, all arrays were mapped to 1D arrays. However, that PR also introduced a bug.

      The following scenario is broken:

      • A user has a table t1 in Postgres and runs a CTAS (CREATE TABLE AS SELECT) command to create table t2 with the same data.
      • PR 46006 resolves the dimensionality of a column by reading the attndims column of the pg_attribute metadata table.
      • This query returns the correct dimensionality for table t1, but for table t2, which was created via CTAS, it always returns 0. As a result, every array column is mapped to a 0-D array, which is just the element type itself (for example, int).
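
      The failure mode above can be sketched in a few lines of Python. This is only an illustration, not the connector's actual code: the helper spark_type_for is hypothetical, and the query's named parameters are written in psycopg style as an assumption about the driver.

```python
# Metadata lookup in the style described above. The %(name)s placeholders
# follow psycopg conventions; the driver choice is an assumption.
ATTNDIMS_QUERY = """
SELECT attndims
FROM pg_attribute
WHERE attrelid = %(table)s::regclass
  AND attname = %(column)s
"""


def spark_type_for(element_type: str, ndims: int) -> str:
    """Wrap element_type in `ndims` levels of array<...> (hypothetical helper)."""
    result = element_type
    for _ in range(ndims):
        result = f"array<{result}>"
    return result


# t1 declares an int[][] column explicitly, so attndims is 2.
print(spark_type_for("int", 2))  # array<array<int>>

# t2 was created via CTAS, so attndims is reported as 0 and the
# array wrapper disappears entirely.
print(spark_type_for("int", 0))  # int
```

      With a dimensionality of 0 the loop never runs, so the column is typed as the bare element type even though it holds arrays.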

      As a solution, we can call the array_ndims function on the given column, which returns the number of dimensions of the array stored in it. This also works for tables created via CTAS.
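
      A minimal sketch of the array_ndims approach follows. The helper names, the LIMIT 1 sampling, and the fallback to one dimension when no usable row exists are all assumptions made for illustration; they are not taken from the actual fix. The identifier quoting is deliberately simple and does not handle schema-qualified names.

```python
def quote_ident(name: str) -> str:
    """Double-quote a PostgreSQL identifier, escaping embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def array_ndims_query(table: str, column: str) -> str:
    """Build a query that samples one row and reports its array dimensions."""
    return (
        f"SELECT array_ndims({quote_ident(column)}) "
        f"FROM {quote_ident(table)} LIMIT 1"
    )


def resolve_ndims(rows, default: int = 1) -> int:
    """Pick the dimensionality from the fetched rows.

    `rows` is a list of one-element tuples, as a DB-API cursor would return.
    array_ndims yields NULL for NULL or empty arrays, and an empty table
    yields no rows; in both cases fall back to `default` (an assumption).
    """
    if not rows or rows[0][0] is None:
        return default
    return int(rows[0][0])


print(array_ndims_query("t2", "arr"))
print(resolve_ndims([(2,)]))     # a 2-D array was sampled
print(resolve_ndims([]))         # empty table: falls back to the default
```

      Because the dimensionality is read from the stored values rather than from catalog metadata, the result does not depend on how the table was created.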

            People

              Assignee: Petar Vasiljevic (petarvasiljevic)
              Reporter: Petar Vasiljevic (petarvasiljevic)
              Votes: 0
              Watchers: 2
