Details
- Type: Bug
- Status: Resolved
- Priority: Minor
- Resolution: Cannot Reproduce
- Affects Version/s: 3.3.0
- Fix Version/s: None
- Component/s: None
Description
On Spark 3.3.0, when reading from a JDBC table (SQLite was used to reproduce) with spark.read.jdbc and sqlite-jdbc:3.34.0.jar, where the table and column names contain special characters, a DataFrame API filter fails with a ParseException.
Script:
```
from pyspark.sql import SparkSession

spark = SparkSession \
    .builder \
    .appName("Databricks Support") \
    .config("spark.jars.packages", "org.xerial:sqlite-jdbc:3.34.0") \
    .getOrCreate()

columns = ["id", "/abc/column", "value"]
data = [(1, 'A', 100), (2, 'B', 200), (3, 'B', 300)]
rdd = spark.sparkContext.parallelize(data)
df = spark.createDataFrame(rdd).toDF(*columns)

options = {
    "url": "jdbc:sqlite:/<local-path>/spark-3.3.1-bin-hadoop3/jars/test.db",
    "dbtable": '"/abc/table"',
    "driver": "org.sqlite.JDBC",
}
df.coalesce(1).write.format("jdbc").options(**options).mode("append").save()

df_1 = spark.read.format("jdbc") \
    .option("url", "jdbc:sqlite:/<local-path>/spark-3.3.1-bin-hadoop3/jars/test.db") \
    .option("dbtable", '"/abc/table"') \
    .option("driver", "org.sqlite.JDBC") \
    .load()

df_2 = df_1.filter("`/abc/column` = 'B'")
df_2.show()
```
Error:
```
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/homebrew/Cellar/apache-spark/3.3.1/libexec/python/pyspark/sql/dataframe.py", line 606, in show
    print(self._jdf.showString(n, 20, vertical))
  File "/opt/homebrew/Cellar/apache-spark/3.3.1/libexec/python/lib/py4j-0.10.9.5-src.zip/py4j/java_gateway.py", line 1321, in __call__
  File "/opt/homebrew/Cellar/apache-spark/3.3.1/libexec/python/pyspark/sql/utils.py", line 196, in deco
    raise converted from None
pyspark.sql.utils.ParseException:
Syntax error at or near '/': extra input '/'(line 1, pos 0)

== SQL ==
/abc/column
^^^
```
However, on Spark 3.2.1 the same DataFrame filter succeeds:
```
>>> df_2.show()
+---+-----------+-----+
| id|/abc/column|value|
+---+-----------+-----+
|  2|          B|  200|
|  3|          B|  300|
+---+-----------+-----+
```
Repro steps:
1. Download Spark 3.2.1 locally.
2. Download the sqlite-jdbc:3.34.0.jar and copy it into the jars folder of the local Spark installation.
3. Run the script above, providing the local jar path.
4. This creates a /abc/table table with a /abc/column column, and the filter returns results.
5. Download Spark 3.3.0 locally.
6. Repeat steps 2 and 3.
7. The filter fails with a ParseException.
Could you please let us know how we can filter on columns with special characters, or how to escape them, on Spark 3.3.0?
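A possible workaround sketch (untested here, and assuming the regression sits in JDBC predicate pushdown, as the linked SPARK-41990 suggests): disable pushdown for this read via the documented pushDownPredicate JDBC option, so the filter is evaluated on the Spark side rather than translated back into source SQL, and reference the column object directly instead of through a SQL expression string.
```
# Workaround sketch, reusing the `spark` session and test.db from the script above.
# Assumption: the ParseException comes from translating the filter expression
# during JDBC predicate pushdown, so keeping the filter on the Spark side avoids it.
df_1 = spark.read.format("jdbc") \
    .option("url", "jdbc:sqlite:/<local-path>/spark-3.3.1-bin-hadoop3/jars/test.db") \
    .option("dbtable", '"/abc/table"') \
    .option("driver", "org.sqlite.JDBC") \
    .option("pushDownPredicate", "false") \
    .load()

# Bracket access resolves the column by name directly, so the name containing
# slashes never passes through the SQL expression parser.
df_2 = df_1.filter(df_1["/abc/column"] == "B")
df_2.show()
```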
Issue Links
- duplicates SPARK-41990: Filtering by composite field name like `field name` doesn't work with pushDownPredicate = true (Resolved)