  1. Spark
  2. SPARK-48390

SparkListenerBus not sending tableName details in logical plan for spark versions 3.4.2 and above

Details

    • Type: Bug
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: 3.4.2, 3.4.3, 3.5.0, 3.5.1, 3.5.2
    • Fix Version/s: None
    • Component/s: Spark Core, SQL
    • Labels: None

    Description

      In OpenLineage, a logical plan event is received via SparkEventListener, and the framework parses it to deduce input/output table names and build lineage.
      The issue is that in Spark versions 3.4.2 and above (tested and reproducible in 3.4.2 and 3.5.0), the logical plan event sent by Spark core is partial: it is missing the tableName property, which was sent in earlier versions (working in Spark 3.3.4).

      Note: This issue is only encountered in drop table events.

      The logical plans emitted for a drop table event in different Spark versions are shown below.

      Spark 3.3.4

      [
        {
          "class": "org.apache.spark.sql.execution.command.DropTableCommand",
          "num-children": 0,
          "tableName": {
            "product-class": "org.apache.spark.sql.catalyst.TableIdentifier",
            "table": "drop_table_test",
            "database": "default"
          },
          "ifExists": false,
          "isView": false,
          "purge": false
        }
      ]
      
      

      Spark 3.4.2

      [
        {
          "class": "org.apache.spark.sql.catalyst.plans.logical.DropTable",
          "num-children": 1,
          "child": 0,
          "ifExists": false,
          "purge": false
        },
        {
          "class": "org.apache.spark.sql.catalyst.analysis.ResolvedIdentifier",
          "num-children": 0,
          "catalog": null,
          "identifier": null
        }
      ]
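
      To make the impact concrete, here is a minimal Python sketch of how a consumer might try to pull the dropped table's name out of the two plans above. The helper name drop_table_name is hypothetical and this is not OpenLineage's actual parsing code; it only assumes the JSON shapes shown in this report.

```python
import json

# Logical plan JSON as reported for Spark 3.3.4 (copied from this issue).
PLAN_3_3_4 = """
[
  {
    "class": "org.apache.spark.sql.execution.command.DropTableCommand",
    "num-children": 0,
    "tableName": {
      "product-class": "org.apache.spark.sql.catalyst.TableIdentifier",
      "table": "drop_table_test",
      "database": "default"
    },
    "ifExists": false,
    "isView": false,
    "purge": false
  }
]
"""

# Logical plan JSON as reported for Spark 3.4.2 (copied from this issue).
PLAN_3_4_2 = """
[
  {
    "class": "org.apache.spark.sql.catalyst.plans.logical.DropTable",
    "num-children": 1,
    "child": 0,
    "ifExists": false,
    "purge": false
  },
  {
    "class": "org.apache.spark.sql.catalyst.analysis.ResolvedIdentifier",
    "num-children": 0,
    "catalog": null,
    "identifier": null
  }
]
"""


def drop_table_name(plan_json):
    """Hypothetical helper: return the dropped table's name, or None if absent."""
    for node in json.loads(plan_json):
        cls = node.get("class", "")
        # Older shape: the command node carries a TableIdentifier directly.
        if cls.endswith("DropTableCommand"):
            ident = node.get("tableName") or {}
            table = ident.get("table")
            if table:
                database = ident.get("database")
                return f"{database}.{table}" if database else table
        # Newer shape: the name would have to come from the child
        # ResolvedIdentifier node, whose fields are serialized as null.
        if cls.endswith("ResolvedIdentifier"):
            ident = node.get("identifier")
            if ident:
                return str(ident)
    return None


print(drop_table_name(PLAN_3_3_4))  # default.drop_table_test
print(drop_table_name(PLAN_3_4_2))  # None
```

      With the 3.4.2 plan there is nothing to extract, because both catalog and identifier are serialized as null, so a parser of this kind cannot recover the table name from the event alone.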
      
      

      More details in referenced issue here: https://github.com/OpenLineage/OpenLineage/issues/2716

          People

            Assignee: Unassigned
            Reporter: Mayur Madnani (mayurmadnani)