Spark / SPARK-10186

Add support for more postgres column types

Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.4.1
    • Fix Version/s: 1.6.0
    • Component/s: SQL
    • Labels: None
    • Environment: Ubuntu on AWS

    Description

      The specific observations below are based on Postgres 9.4 tables accessed via the postgresql-9.4-1201.jdbc41.jar driver. However, based on the behavior, I would expect the problem to exist for all external SQL databases.

      • json and jsonb columns generate java.sql.SQLException: Unsupported type 1111. While it is reasonable to not support dynamic schema discovery of JSON columns automatically (it requires two passes over the data), a better behavior would be to create a String column and return the JSON.
      • Array columns generate java.sql.SQLException: Unsupported type 2003. This is true even for simple types, e.g., text[]. A better behavior would be to create an Array column.
      • Custom type columns are mapped to a String column. This behavior is harder to understand as the schema of a custom type is fixed and therefore mappable to a Struct column. The automatic conversion to a string is also inconsistent when compared to json and array column handling.
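
      The mapping proposed in the first two bullets can be sketched as follows. This is only an illustration, not Spark's code: the function name `proposed_column_type` and the `ELEMENT_TYPES` table are hypothetical, and the sketch assumes the Postgres driver reports array types with a leading underscore (e.g. `_text` for text[]). The two numeric codes are the `java.sql.Types` constants `OTHER` (1111) and `ARRAY` (2003) that appear in the exception messages.

```python
# JDBC type codes as defined in java.sql.Types.
JDBC_OTHER = 1111  # java.sql.Types.OTHER, the code seen for json/jsonb columns
JDBC_ARRAY = 2003  # java.sql.Types.ARRAY, the code seen for e.g. text[] columns

# Hypothetical lookup from Postgres element type names to Spark SQL type
# names; a real mapping would cover many more types.
ELEMENT_TYPES = {
    "text": "string",
    "varchar": "string",
    "int4": "int",
    "int8": "bigint",
    "float8": "double",
    "bool": "boolean",
}


def proposed_column_type(jdbc_type, type_name):
    """Return the Spark SQL type name suggested in the report, or None
    when this sketch has no mapping for the column."""
    # json and jsonb: expose the raw JSON text as a String column.
    if jdbc_type == JDBC_OTHER and type_name in ("json", "jsonb"):
        return "string"
    # Arrays: assumption that the driver names them "_<element type>".
    if jdbc_type == JDBC_ARRAY and type_name.startswith("_"):
        element = ELEMENT_TYPES.get(type_name[1:])
        if element is not None:
            return "array<" + element + ">"
    return None
```

      For example, `proposed_column_type(2003, "_text")` yields `array<string>`, while a type outside the table yields `None`, which a caller could still report as unsupported.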

      The exceptions are thrown by org.apache.spark.sql.jdbc.JDBCRDD$.org$apache$spark$sql$jdbc$JDBCRDD$$getCatalystType(JDBCRDD.scala:100), so this definitely looks like a Spark SQL problem rather than a JDBC driver problem.
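
      Since the failure is in Spark's type mapping rather than in the driver, one possible interim workaround (an assumption, not something the report describes) is to have Postgres cast the affected columns to text before Spark sees them. The JDBC data source accepts a parenthesized subquery with an alias as its `dbtable` option. The helper `cast_to_text_subquery` and the table and column names below are hypothetical.

```python
def cast_to_text_subquery(table, passthrough, cast_columns, alias="t"):
    """Build a subquery for the JDBC `dbtable` option that casts the
    given columns to text on the Postgres side.

    Identifiers are interpolated as-is, so pass trusted names only.
    """
    select_list = list(passthrough)
    for column in cast_columns:
        # Postgres cast syntax; json, jsonb and array values all cast to text.
        select_list.append("{0}::text AS {0}".format(column))
    return "(SELECT {0} FROM {1}) AS {2}".format(
        ", ".join(select_list), table, alias
    )


# Hypothetical table "events" with a json column and a text[] column.
dbtable = cast_to_text_subquery("events", ["id"], ["payload", "tags"])
```

      The resulting string would be passed as the `dbtable` option when loading through the JDBC data source, so the unsupported columns arrive as String columns.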

            People

              Assignee: mariusvniekerk Marius van Niekerk
              Reporter: simeons Simeon Simeonov