SPARK-23228: Able to track Python create SparkSession in JVM


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.4.0
    • Fix Version/s: 2.4.0
    • Component/s: PySpark
    • Labels: None

    Description

      Currently, when we write a SparkListener that uses SparkSession and load it in a PySpark application, the listener cannot get the SparkSession created by PySpark, so the assert in the example below throws an exception. To avoid this issue, PySpark should register the SparkSession it creates as the JVM-side default session.

      // Run from the PySpark application; creating a table fires a CreateTableEvent
      // on the listener bus:
      spark.sql("CREATE TABLE test (a INT)")
      
      import org.apache.spark.internal.Logging
      import org.apache.spark.scheduler.{SparkListener, SparkListenerEvent}
      import org.apache.spark.sql.SparkSession
      import org.apache.spark.sql.catalyst.catalog.CreateTableEvent
      
      class TestSparkSession extends SparkListener with Logging {
        override def onOtherEvent(event: SparkListenerEvent): Unit = {
          event match {
            case CreateTableEvent(db, table) =>
              // Fails when the session was created from PySpark: neither the active
              // nor the default session is set on the JVM side, so .get throws.
              val session = SparkSession.getActiveSession.orElse(SparkSession.getDefaultSession).get
              assert(session != null)
              val tableInfo = session.sharedState.externalCatalog.getTable(db, table)
              logInfo(s"Table info ${tableInfo}")
      
            case e =>
              logInfo(s"event $e")
          }
        }
      }
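
      The proposed direction is for PySpark to mirror the session it creates into the JVM-side default session, so that JVM listeners like the one above can find it. Below is a minimal sketch of that idea from the Python side, assuming the Py4J gateway exposes the Scala companion object's setDefaultSession/clearDefaultSession methods and using the internal _jvm/_jsparkSession handles purely for illustration; it shows the intent, not necessarily the exact change that was merged.

      # Sketch only: register the Python-created session as the JVM default session
      # so JVM-side listeners can find it via SparkSession.getDefaultSession.
      from pyspark.sql import SparkSession

      spark = SparkSession.builder.appName("listener-repro").getOrCreate()

      jvm = spark._jvm                    # Py4J view of the JVM (internal attribute)
      jsession = spark._jsparkSession     # underlying org.apache.spark.sql.SparkSession

      # Mirror the Python session into the JVM companion object so that
      # SparkSession.getDefaultSession on the Scala side returns it.
      jvm.SparkSession.setDefaultSession(jsession)

      # Run the workload that the listener observes (the statement from the repro above).
      spark.sql("CREATE TABLE test (a INT)")

      # Clear the JVM default session again when the Python session is stopped.
      jvm.SparkSession.clearDefaultSession()
      spark.stop()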
      


          People

            Assignee: Saisai Shao (jerryshao)
            Reporter: Saisai Shao (jerryshao)
            Votes: 0
            Watchers: 2
