Spark / SPARK-25003

PySpark does not use Spark SQL extensions

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.2.2, 2.3.1
    • Fix Version/s: 3.0.0
    • Component/s: PySpark
    • Labels: None

    Description

      When PySpark creates a SparkSession, it runs this code:

      https://github.com/apache/spark/blob/v2.2.2/python/pyspark/sql/session.py#L216

      if jsparkSession is None:
        jsparkSession = self._jvm.SparkSession(self._jsc.sc())
      self._jsparkSession = jsparkSession
      

      I believe this ends up calling the following constructor:
      https://github.com/apache/spark/blob/v2.2.2/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala#L85-L87

        private[sql] def this(sc: SparkContext) {
          this(sc, None, None, new SparkSessionExtensions)
        }
      

      This constructor creates a new SparkSessionExtensions object and does not pick up any extensions that were set in the config, unlike the companion object's getOrCreate:
      https://github.com/apache/spark/blob/v2.2.2/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala#L928-L944

      // In getOrCreate:
      // Initialize extensions if the user has defined a configurator class.
      val extensionConfOption = sparkContext.conf.get(StaticSQLConf.SPARK_SESSION_EXTENSIONS)
      if (extensionConfOption.isDefined) {
        val extensionConfClassName = extensionConfOption.get
        try {
          val extensionConfClass = Utils.classForName(extensionConfClassName)
          val extensionConf = extensionConfClass.newInstance()
            .asInstanceOf[SparkSessionExtensions => Unit]
          extensionConf(extensions)
        } catch {
          // Ignore the error if we cannot find the class or when the class has the wrong type.
          case e @ (_: ClassCastException |
                    _: ClassNotFoundException |
                    _: NoClassDefFoundError) =>
            logWarning(s"Cannot use $extensionConfClassName to configure session extensions.", e)
        }
      }
      

      I think a quick fix would be for PySpark to call the companion object's getOrCreate method instead of invoking the constructor directly with the SparkContext. Alternatively, we could ensure that every constructor attempts to pick up custom extensions when they are set.
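
      To make the difference between the two construction paths concrete, here is a toy model in plain Python. It is not Spark code: Extensions, MyExtensions, CLASSPATH, BuggySession, FixedSession and apply_configured_extensions are all hypothetical names invented for this sketch, and a dict stands in for JVM class loading. The only real identifier is the config key spark.sql.extensions, which is what StaticSQLConf.SPARK_SESSION_EXTENSIONS refers to. The sketch illustrates the second option above, assuming a shared helper that every construction path calls.

```python
import logging

logger = logging.getLogger("toy_spark")

# Real Spark config key; everything else in this sketch is a toy stand-in.
EXTENSIONS_KEY = "spark.sql.extensions"


class Extensions:
    """Toy stand-in for SparkSessionExtensions; it only records rule names."""

    def __init__(self):
        self.rules = []


class MyExtensions:
    """Hypothetical user configurator (the SparkSessionExtensions => Unit role)."""

    def __call__(self, extensions):
        extensions.rules.append("my_rule")


# Stand-in for the JVM classpath: maps class names to classes.
CLASSPATH = {"demo.MyExtensions": MyExtensions}


def apply_configured_extensions(conf, extensions):
    """Apply the configurator named in conf, if one is set."""
    class_name = conf.get(EXTENSIONS_KEY)
    if class_name is not None:
        try:
            CLASSPATH[class_name]()(extensions)
        except (KeyError, TypeError) as e:
            # Like the Scala code, warn and continue instead of failing.
            logger.warning("Cannot use %s to configure session extensions: %r",
                           class_name, e)
    return extensions


class BuggySession:
    """Models the reported behavior: only get_or_create reads the config."""

    def __init__(self, conf):
        self.extensions = Extensions()  # config is never consulted here

    @classmethod
    def get_or_create(cls, conf):
        session = cls(conf)
        apply_configured_extensions(conf, session.extensions)
        return session


class FixedSession:
    """Models the second proposed fix: every construction path reads the config."""

    def __init__(self, conf):
        self.extensions = apply_configured_extensions(conf, Extensions())

    @classmethod
    def get_or_create(cls, conf):
        return cls(conf)


conf = {EXTENSIONS_KEY: "demo.MyExtensions"}
print(BuggySession(conf).extensions.rules)                # []
print(BuggySession.get_or_create(conf).extensions.rules)  # ['my_rule']
print(FixedSession(conf).extensions.rules)                # ['my_rule']
```

      In this model, the direct BuggySession constructor corresponds to what PySpark does today through self._jvm.SparkSession(self._jsc.sc()), which is why the configured extensions never show up.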

            People

              rspitzer Russell Spitzer