Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Versions: 3.2.1, 4.0.0
Description
The Spark history server fails to display the query for a cached JDBC relation (or a calculation derived from it) whose table name contains quotes.
(Screenshot and generated history in attachments)
How to reproduce:
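import org.apache.spark.sql.SaveMode

// `properties` is a java.util.Properties holding the JDBC connection settings
val ticketsDf = spark.read.jdbc("jdbc:postgresql://localhost:5432/demo", """ "test-schema".tickets """.trim, properties)
val bookingDf = spark.read.parquet("path/bookings")
// Cache the JDBC relation and materialize it
ticketsDf.cache().count()
val resultDf = bookingDf.join(ticketsDf, Seq("book_ref"))
resultDf.write.mode(SaveMode.Overwrite).parquet("path/result")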
The problem is in the SparkPlanGraphNode class, which creates a DOT node. When there are no metrics to display, it simply returns the tagged name. In this case the name contains double quotes, which end the quoted label early and corrupt the DOT file.
The suggested solution is to escape the name string before it is written into the node.
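A minimal sketch of the escaping idea, not the actual Spark code: `DotLabelSketch`, `escapeDot`, and `dotNode` are hypothetical names, the node name is an illustrative value, and the label format is simplified. It only shows how escaping backslashes and double quotes keeps a quoted DOT label intact.

```scala
object DotLabelSketch {
  // Hypothetical helper: escape the characters that would otherwise
  // terminate a double-quoted DOT string too early.
  def escapeDot(s: String): String =
    s.replace("\\", "\\\\").replace("\"", "\\\"")

  // Hypothetical stand-in for building a DOT node that has no metrics.
  def dotNode(id: Long, name: String): String =
    s"""  $id [label="${escapeDot(name)}"];"""

  def main(args: Array[String]): Unit = {
    // Illustrative node name containing a quoted schema name
    val name = "Scan JDBCRelation(\"test-schema\".tickets)"
    println(dotNode(0L, name))
    // prints:   0 [label="Scan JDBCRelation(\"test-schema\".tickets)"];
  }
}
```

Without the escaping step, the first quote inside the name would close the label and the rest of the line would no longer parse as valid DOT.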