Details
-
Bug
-
Status: Resolved
-
Minor
-
Resolution: Incomplete
-
2.1.0
-
None
Description
If a date literal is used in a pushed-down filter expression, e.g.
val postingDate = java.sql.Date.valueOf("2016-06-03") val count = jdbcDF.filter($"POSTINGDATE" === postingDate).count
where the POSTINGDATE column is of JDBC type Date, the resulting pushed-down SQL query looks like the following:
SELECT .. <columns> ... FROM <table> WHERE POSTINGDATE = '2016-06-03'
Specifically, the date is compiled into a string literal using the hardcoded yyyy-MM-dd format that java.sql.Date.toString emits. Note the implied string conversion for date (and timestamp) values in JDBCRDD.compileValue
/** * Converts value to SQL expression. */ private def compileValue(value: Any): Any = value match { case stringValue: String => s"'${escapeSql(stringValue)}'" case timestampValue: Timestamp => "'" + timestampValue + "'" case dateValue: Date => "'" + dateValue + "'" case arrayValue: Array[Any] => arrayValue.map(compileValue).mkString(", ") case _ => value }
The resulting query fails if the database is expecting a different format for date string literals. For example, the default format for Oracle is 'dd-MMM-yy', so when the relation query is executed, it fails with a syntax error.
ORA-01861: literal does not match format string
01861. 00000 - "literal does not match format string"
In some situations it may be possible to change the database's expected date format to match the Java format, but this is not always possible (e.g. reading from an external database server)
Shouldn't this kind of conversion be going through some kind of vendor specific translation (e.g. through a JDBCDialect)?
Attachments
Issue Links
- relates to
-
SPARK-18004 DataFrame filter Predicate push-down fails for Oracle Timestamp type columns
- Resolved