Spark / SPARK-11788

Using java.sql.Timestamp and java.sql.Date in where clauses on JDBC dataframes causes SQLServerException

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.5.2
    • Fix Version/s: 1.5.3, 1.6.0
    • Component/s: SQL
    • Labels: None

    Description

      I have an MSSQL table with a timestamp column and am reading it using DataFrameReader.jdbc. Adding a where clause that compares the column against a timestamp range causes a SQLServerException.

      The problem is in compileValue (https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRDD.scala#L264), which should surround timestamps and dates with quotes but currently does so only for strings.

      Sample pseudo-code:
      val beg = new java.sql.Timestamp(...)
      val end = new java.sql.Timestamp(...)
      val filtered = jdbcdf.where($"TIMESTAMP_COLUMN" >= beg && $"TIMESTAMP_COLUMN" < end)

      Generated SQL query: "TIMESTAMP_COLUMN >= 2015-01-01 00:00:00.0"
      Query should use quotes around timestamp: "TIMESTAMP_COLUMN >= '2015-01-01 00:00:00.0'"
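
      The quoting described above can be sketched as follows. This is a minimal illustration, not the actual Spark code or the committed fix: the object name JdbcLiteralSketch and the helper escapeSql are hypothetical, and the escaping shown (doubling single quotes) is only a simple stand-in.

```scala
import java.sql.{Date, Timestamp}

// Hypothetical sketch of value-to-SQL-literal conversion for pushed-down filters.
object JdbcLiteralSketch {
  // Assumed escaping rule: double any single quote inside a string literal.
  private def escapeSql(s: String): String = s.replace("'", "''")

  // Strings, timestamps, and dates are wrapped in single quotes;
  // every other value (numbers, booleans, ...) is passed through unchanged.
  def compileValue(value: Any): Any = value match {
    case s: String     => s"'${escapeSql(s)}'"
    case ts: Timestamp => s"'$ts'"
    case d: Date       => s"'$d'"
    case other         => other
  }

  def main(args: Array[String]): Unit = {
    val beg = Timestamp.valueOf("2015-01-01 00:00:00")
    // The literal is now quoted inside the generated predicate.
    println(s"TIMESTAMP_COLUMN >= ${compileValue(beg)}")
  }
}
```

      With the timestamp case in place, the predicate matches the quoted form shown above instead of the unquoted one that SQL Server rejects.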

      The fallback is to filter client-side, which is extremely inefficient because the whole table has to be downloaded to the Spark executors before any rows are discarded.
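
      Until the quoting is fixed, one way to keep the filtering on the database side may be to pass a derived table as the table argument of DataFrameReader.jdbc, which accepts anything valid in a FROM clause. The sketch below only builds that expression. The object name, helper name, alias, and table name are hypothetical, and identifiers are interpolated without escaping, so it assumes trusted table and column names.

```scala
import java.sql.Timestamp

// Hypothetical helper that builds a derived-table expression with the
// timestamp range already applied, so the database does the filtering.
object PushdownWorkaroundSketch {
  def filteredTable(table: String, column: String, beg: Timestamp, end: Timestamp): String =
    // Timestamps are rendered via toString and wrapped in single quotes.
    s"(SELECT * FROM $table WHERE $column >= '$beg' AND $column < '$end') AS filtered"

  def main(args: Array[String]): Unit = {
    val beg = Timestamp.valueOf("2015-01-01 00:00:00")
    val end = Timestamp.valueOf("2015-02-01 00:00:00")
    // MY_TABLE is a placeholder table name.
    println(filteredTable("MY_TABLE", "TIMESTAMP_COLUMN", beg, end))
  }
}
```

      The resulting string would be passed in place of the table name, for example sqlContext.read.jdbc(url, tableExpr, props), where url and props are whatever connection settings are already in use.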

      Thanks

          People

            Assignee: Huaxin Gao (huaxingao)
            Reporter: Martin Tapp (doctapp)
            Votes: 0
            Watchers: 7
