Spark / SPARK-12437

Reserved words (like `table`) throw an error when writing a DataFrame to JDBC


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: SQL

    Description

      From: A Spark user

      If a DataFrame column name contains a SQL reserved word, the DataFrame cannot be written to a JDBC source, because the generated CREATE TABLE statement does not quote column names. This is similar to an error found in the spark-redshift adapter:
      https://github.com/databricks/spark-redshift/issues/80

      I have reproduced this on a MySQL (AWS Aurora) database

      Steps to reproduce:

      // `jdbcUrl` points at the target MySQL (Aurora) database (defined elsewhere);
      // the "diamonds" DataFrame contains a column named `table`, a SQL reserved word.
      val connectionProperties = new java.util.Properties()
      sqlContext.table("diamonds").write.jdbc(jdbcUrl, "diamonds", connectionProperties)
      

      Exception:

      com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'table DOUBLE PRECISION , price INTEGER , x DOUBLE PRECISION , y DOUBLE PRECISION' at line 1
      	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
      	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
      	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
      	at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
      	at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
      	at com.mysql.jdbc.Util.getInstance(Util.java:386)
      	at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1054)
      	at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4237)
      	at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4169)
      	at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2617)
      	at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2778)
      	at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2825)
      	at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2156)
      	at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2459)
      	at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2376)
      	at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2360)
      	at org.apache.spark.sql.DataFrameWriter.jdbc(DataFrameWriter.scala:275)
      

      You can work around this by renaming the column on the DataFrame before writing, but ideally Spark should quote (escape) the column name, which the database allows. Example:

      CREATE TABLE `test_table_column` (
        `id` int(11) DEFAULT NULL,
        `table` varchar(100) DEFAULT NULL
      ) 
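
      The rename workaround mentioned above can be sketched as follows. This is a minimal sketch, not a fix: `jdbcUrl` is assumed to be defined elsewhere, and the replacement column name `table_renamed` is illustrative.

      // Workaround sketch: rename the reserved-word column before writing.
      // Assumes `jdbcUrl` points at the target MySQL database.
      val connectionProperties = new java.util.Properties()
      sqlContext.table("diamonds")
        .withColumnRenamed("table", "table_renamed") // avoid the SQL reserved word
        .write
        .jdbc(jdbcUrl, "diamonds", connectionProperties)

      A proper fix would instead have the JDBC writer quote every identifier (e.g. with backticks on MySQL) when generating the CREATE TABLE statement, so no rename is needed.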
      


            People

              Assignee: apachespark Apache Spark
              Reporter: rxin Reynold Xin
              Votes: 0
              Watchers: 4
