Spark / SPARK-7869

Spark Data Frame Fails to Load Postgres Tables with JSONB DataType Columns

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.3.0, 1.3.1
    • Fix Version/s: 1.6.0
    • Component/s: PySpark, SQL
    • Labels: None
    • Environment: Spark 1.3.1

      Description

      Most of our Postgres tables load into DataFrames without problems. However, a number of our tables use the JSONB data type, and Spark raises an error and refuses to load any table that contains such a column. Full JSONB support may be a tall order in the short term, but it would be great if Spark would at least load the table and skip the columns it cannot handle, either by default or as an option.

      pdf = sql_context.load(source="jdbc", url=url, dbtable="table_of_json")
      
      Py4JJavaError: An error occurred while calling o41.load.
      : java.sql.SQLException: Unsupported type 1111
          at org.apache.spark.sql.jdbc.JDBCRDD$.getCatalystType(JDBCRDD.scala:78)
          at org.apache.spark.sql.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:112)
          at org.apache.spark.sql.jdbc.JDBCRelation.<init>(JDBCRelation.scala:133)
          at org.apache.spark.sql.jdbc.DefaultSource.createRelation(JDBCRelation.scala:121)
          at org.apache.spark.sql.sources.ResolvedDataSource$.apply(ddl.scala:219)
          at org.apache.spark.sql.SQLContext.load(SQLContext.scala:697)
          at org.apache.spark.sql.SQLContext.load(SQLContext.scala:685)
          at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
          at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
          at java.lang.reflect.Method.invoke(Method.java:606)
          at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
          at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
          at py4j.Gateway.invoke(Gateway.java:259)
          at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
          at py4j.commands.CallCommand.execute(CallCommand.java:79)
          at py4j.GatewayConnection.run(GatewayConnection.java:207)
          at java.lang.Thread.run(Thread.java:745)
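
Type code 1111 in the trace is `java.sql.Types.OTHER`, which is how the PostgreSQL JDBC driver reports a JSONB column. On affected versions, one possible workaround is to pass a subquery as `dbtable` that casts the JSONB columns to text, so that Spark only sees string columns. The following is a minimal sketch, not part of the original report: the helper name `jsonb_as_text_subquery` and the column names `id` and `payload` are hypothetical, and it assumes an unqualified table name.

```python
def jsonb_as_text_subquery(table, plain_columns, jsonb_columns, alias="t"):
    """Build a dbtable subquery that casts JSONB columns to text.

    Hypothetical helper: all names passed in are supplied by the caller.
    Assumes `table` is an unqualified table name (no schema prefix).
    """
    def quote(ident):
        # Double-quote an identifier, escaping embedded double quotes.
        return '"' + ident.replace('"', '""') + '"'

    # Plain columns are selected as-is; JSONB columns are cast to text
    # and keep their original name through an alias.
    select_list = [quote(c) for c in plain_columns]
    select_list += ["{0}::text AS {0}".format(quote(c)) for c in jsonb_columns]

    return "(SELECT {cols} FROM {table}) AS {alias}".format(
        cols=", ".join(select_list),
        table=quote(table),
        alias=quote(alias),
    )


# Usage with the call from the report (requires a live SQLContext and
# database, so it is left commented out here):
# dbtable = jsonb_as_text_subquery("table_of_json", ["id"], ["payload"])
# pdf = sql_context.load(source="jdbc", url=url, dbtable=dbtable)
```

With this approach the JSON arrives as plain strings, so any parsing has to happen on the Spark side after loading.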
      

              People

              • Assignee: Alexey Grishchenko (0x0fff)
              • Reporter: Brad Willard (brdwrd)
              • Votes: 2
              • Watchers: 8
