Spark 3.4 history server's SQL tab incorrectly groups SQL executions when replaying event logs from Spark 3.3 and earlier



      In Spark 3.4.0 RC4, the Spark History Server's SQL tab incorrectly groups SQL executions when replaying event logs generated by older Spark versions.



      Start ./bin/spark-shell --conf spark.eventLog.enabled=true --conf spark.eventLog.dir=eventlogs and run three non-nested SQL queries:

      sql("select * from range(10)").collect()
      sql("select * from range(20)").collect()
      sql("select * from range(30)").collect()

      Exit the shell and use the Spark History Server to replay this application's UI.

      In the SQL tab I expect to see three separate queries, but Spark 3.4's history server incorrectly groups the second and third queries as nested queries of the first (see attached screenshot).


      Root cause

      https://github.com/apache/spark/pull/39268 / SPARK-41752 added a new non-optional rootExecutionId: Long field to the SparkListenerSQLExecutionStart case class.

      When JsonProtocol deserializes this event it uses the "ignore missing properties" Jackson deserialization option, causing the rootExecutionId field to be initialized with a default value of 0 when the field is absent from the log.

      The value 0 is a legitimate execution ID, so in the deserialized event we have no ability to distinguish between the absence of a value and a case where all queries have the first query as the root.

      Proposed fix

      I think we should change this field to be of type Option[Long]. I believe this is a release blocker for Spark 3.4.0 because we cannot change the type of this new field in a future release without breaking binary compatibility.
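
To make the ambiguity concrete, here is a minimal, dependency-free sketch. It does not use Spark's real classes or Jackson: StartWithLong, StartWithOption, decodeLong and decodeOption are hypothetical stand-ins, and a parsed JSON event is modeled as a plain Map. Falling back to the execution's own ID when the field is absent is an assumption about how a consumer might interpret None, not necessarily what the final patch does.

```scala
// Hypothetical, simplified stand-ins for the real event; not Spark's actual classes.
final case class StartWithLong(executionId: Long, rootExecutionId: Long)
final case class StartWithOption(executionId: Long, rootExecutionId: Option[Long])

object RootExecutionIdSketch {
  // A parsed JSON event, modeled as a plain map of field name -> value.
  type Json = Map[String, Long]

  // Mimics lenient deserialization: a missing primitive field silently becomes 0.
  def decodeLong(json: Json): StartWithLong =
    StartWithLong(json("executionId"), json.getOrElse("rootExecutionId", 0L))

  // With Option, a missing field stays distinguishable as None.
  def decodeOption(json: Json): StartWithOption =
    StartWithOption(json("executionId"), json.get("rootExecutionId"))

  // Three events as an older log would record them: no rootExecutionId field.
  val oldLog: Seq[Json] = Seq(0L, 1L, 2L).map(id => Map("executionId" -> id))

  // Root assigned to each execution when the field is a plain Long.
  val rootsWithLong: Seq[Long] = oldLog.map(decodeLong).map(_.rootExecutionId)

  // Root assigned when the field is an Option; None is treated as "own root"
  // (an assumption made for this sketch).
  val rootsWithOption: Seq[Long] =
    oldLog.map(decodeOption).map(e => e.rootExecutionId.getOrElse(e.executionId))

  def main(args: Array[String]): Unit = {
    println(rootsWithLong)   // List(0, 0, 0): every query appears nested under query 0
    println(rootsWithOption) // List(0, 1, 2): three independent queries
  }
}
```

With the Long field, all three executions report root 0, which matches the grouping seen in the SQL tab. With Option[Long], the reader can tell "field absent" apart from "root is execution 0".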


