Spark / SPARK-12437

Reserved words (like "table") throw an error when writing a DataFrame to JDBC

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: SQL
    • Labels:

      Description

      From: A Spark user

      If a DataFrame column name contains a SQL reserved word, the DataFrame will not write to a JDBC source. This is similar to an issue found in the spark-redshift adapter:
      https://github.com/databricks/spark-redshift/issues/80

      I have reproduced this on a MySQL (AWS Aurora) database.

      Steps to reproduce:

      // Assumes the "diamonds" table has a column named "table" (a SQL reserved word)
      val connectionProperties = new java.util.Properties()
      sqlContext.table("diamonds").write.jdbc(jdbcUrl, "diamonds", connectionProperties)
      

      Exception:

      com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'table DOUBLE PRECISION , price INTEGER , x DOUBLE PRECISION , y DOUBLE PRECISION' at line 1
      	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
      	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
      	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
      	at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
      	at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
      	at com.mysql.jdbc.Util.getInstance(Util.java:386)
      	at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1054)
      	at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4237)
      	at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4169)
      	at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2617)
      	at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2778)
      	at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2825)
      	at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2156)
      	at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2459)
      	at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2376)
      	at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2360)
      	at org.apache.spark.sql.DataFrameWriter.jdbc(DataFrameWriter.scala:275)
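
The statement Spark sends here is a CREATE TABLE built from the DataFrame schema with unquoted column names. As a minimal sketch (a hypothetical helper, not Spark's actual code path), this is how the reserved word ends up bare in the DDL that MySQL rejects:

```scala
// Hypothetical sketch: build CREATE TABLE DDL from (name, type) pairs
// without quoting identifiers, as the error message above suggests happens.
def unquotedDdl(tableName: String, schema: Seq[(String, String)]): String = {
  val cols = schema.map { case (name, sqlType) => s"$name $sqlType" }
  s"CREATE TABLE $tableName (${cols.mkString(" , ")})"
}

// Column names taken from the error message; the full diamonds schema is longer.
val ddl = unquotedDdl("diamonds", Seq(
  "table" -> "DOUBLE PRECISION", // reserved word, left unquoted
  "price" -> "INTEGER"
))
// The fragment "table DOUBLE PRECISION , price INTEGER" is exactly
// what MySQL's syntax error points at.
```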
      

      You can work around this by renaming the column on the DataFrame before writing, but ideally Spark should quote the identifier, which MySQL allows. Example:

      CREATE TABLE `test_table_column` (
        `id` int(11) DEFAULT NULL,
        `table` varchar(100) DEFAULT NULL
      ) 
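
A fix along those lines could quote every identifier when generating the statement. A sketch, assuming MySQL-style backtick quoting (other databases use different quote characters, and this is not Spark's actual dialect API):

```scala
// Sketch: escape embedded backticks and wrap the identifier in backticks,
// so reserved words such as "table" become legal column names in MySQL DDL.
def quoteIdentifier(name: String): String =
  "`" + name.replace("`", "``") + "`"

def quotedDdl(tableName: String, schema: Seq[(String, String)]): String = {
  val cols = schema.map { case (name, sqlType) =>
    s"${quoteIdentifier(name)} $sqlType"
  }
  s"CREATE TABLE ${quoteIdentifier(tableName)} (${cols.mkString(", ")})"
}

// Reproduces the quoted DDL shown in the example above.
val ddl = quotedDdl("test_table_column", Seq(
  "id" -> "int(11) DEFAULT NULL",
  "table" -> "varchar(100) DEFAULT NULL"
))
```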
      

              People

              • Assignee: Apache Spark (apachespark)
              • Reporter: Reynold Xin (rxin)
              • Votes: 0
              • Watchers: 4

                Dates

                • Created:
                  Updated:
                  Resolved: