Spark / SPARK-23228

Able to track Python-created SparkSession in JVM

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.4.0
    • Fix Version/s: 2.4.0
    • Component/s: PySpark
    • Labels:
      None

      Description

      Currently, when a SparkListener that uses SparkSession is loaded in a PySpark application, the listener cannot find the SparkSession created by PySpark, so the listener below throws an exception when it tries to retrieve the session. To avoid this, we should register the PySpark-created SparkSession as the JVM defaultSession.

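      // Statement issued from the PySpark application; it triggers a CreateTableEvent.
      spark.sql("CREATE TABLE test (a INT)")
      
      import org.apache.spark.internal.Logging
      import org.apache.spark.scheduler.{SparkListener, SparkListenerEvent}
      import org.apache.spark.sql.SparkSession
      import org.apache.spark.sql.catalyst.catalog.CreateTableEvent
      
      // Listener loaded in the JVM that needs a SparkSession to inspect the new table.
      class TestSparkSession extends SparkListener with Logging {
        override def onOtherEvent(event: SparkListenerEvent): Unit = {
          event match {
            case CreateTableEvent(db, table) =>
              // Fails when the session was created by PySpark: neither lookup finds it.
              val session = SparkSession.getActiveSession.orElse(SparkSession.getDefaultSession).get
              assert(session != null)
              val tableInfo = session.sharedState.externalCatalog.getTable(db, table)
              logInfo(s"Table info ${tableInfo}")
      
            case e =>
              logInfo(s"event $e")
      
          }
        }
      }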
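
      Until the session is registered in the JVM, a listener can guard the lookup rather than call .get on an empty Option. The following is a minimal sketch, assuming only the public SparkSession.getActiveSession and SparkSession.getDefaultSession methods; the object SessionLookup and its methods currentSession and withSession are hypothetical helpers introduced here for illustration, not part of Spark or of the fix for this issue.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical helper (not part of Spark): resolves a session without
// calling .get on an empty Option.
object SessionLookup {
  // Prefer the session active on the current thread, then the JVM-wide default.
  def currentSession(): Option[SparkSession] =
    SparkSession.getActiveSession.orElse(SparkSession.getDefaultSession)

  // Applies `body` to the resolved session, or returns None if neither is set.
  def withSession[A](body: SparkSession => A): Option[A] =
    currentSession().map(body)
}
```

      Inside the listener, SessionLookup.withSession(_.sharedState.externalCatalog.getTable(db, table)) would yield None instead of throwing when no session is visible from the JVM. This only avoids the exception; the listener still cannot read the table until the PySpark-created session is registered as described above.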
      

            People

            • Assignee: Saisai Shao (jerryshao)
            • Reporter: Saisai Shao (jerryshao)
            • Votes: 0
            • Watchers: 3
