  Spark / SPARK-25003

PySpark does not use Spark SQL extensions

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.2.2, 2.3.1
    • Fix Version/s: 3.0.0
    • Component/s: PySpark
    • Labels:
      None

      Description

      When PySpark creates a SparkSession here:

      https://github.com/apache/spark/blob/v2.2.2/python/pyspark/sql/session.py#L216

      if jsparkSession is None:
        jsparkSession = self._jvm.SparkSession(self._jsc.sc())
      self._jsparkSession = jsparkSession
      

      I believe this ends up calling the following constructor:
      https://github.com/apache/spark/blob/v2.2.2/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala#L85-L87

        private[sql] def this(sc: SparkContext) {
          this(sc, None, None, new SparkSessionExtensions)
        }
      

      This constructor creates a new SparkSessionExtensions object and does not pick up any extensions that were set in the config, unlike getOrCreate in the companion object:
      https://github.com/apache/spark/blob/v2.2.2/sql/core/src/main/scala/org/apache/spark/sql/SparkSession.scala#L928-L944

      // in getOrCreate
              // Initialize extensions if the user has defined a configurator class.
              val extensionConfOption = sparkContext.conf.get(StaticSQLConf.SPARK_SESSION_EXTENSIONS)
              if (extensionConfOption.isDefined) {
                val extensionConfClassName = extensionConfOption.get
                try {
                  val extensionConfClass = Utils.classForName(extensionConfClassName)
                  val extensionConf = extensionConfClass.newInstance()
                    .asInstanceOf[SparkSessionExtensions => Unit]
                  extensionConf(extensions)
                } catch {
                  // Ignore the error if we cannot find the class or when the class has the wrong type.
                  case e @ (_: ClassCastException |
                            _: ClassNotFoundException |
                            _: NoClassDefFoundError) =>
                    logWarning(s"Cannot use $extensionConfClassName to configure session extensions.", e)
                }
              }
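
The lookup above can be mimicked without a JVM. The following is a minimal plain-Python analogue, not Spark code: it reads a class name from the config, loads and instantiates it, applies it to an extensions object, and logs a warning instead of failing when the class cannot be used. Only the config key `spark.sql.extensions` comes from Spark; `Extensions`, `ExampleConfigurator`, and `apply_configured_extensions` are hypothetical names invented for this sketch.

```python
import importlib
import logging

log = logging.getLogger("extensions_sketch")


class Extensions:
    """Hypothetical stand-in for SparkSessionExtensions: records injected rules."""

    def __init__(self):
        self.rules = []

    def inject_rule(self, name):
        self.rules.append(name)


class ExampleConfigurator:
    """Hypothetical configurator: a no-argument class whose instances are callable."""

    def __call__(self, extensions):
        extensions.inject_rule("example_rule")


def apply_configured_extensions(conf, extensions):
    """Apply the configurator named under "spark.sql.extensions" in conf, if any."""
    class_name = conf.get("spark.sql.extensions")
    if class_name is None:
        return extensions
    try:
        # Split "package.module.ClassName" into module path and class name.
        module_name, _, attr = class_name.rpartition(".")
        configurator = getattr(importlib.import_module(module_name), attr)()
        configurator(extensions)
    except (ImportError, AttributeError, TypeError, ValueError) as e:
        # Mirror the Scala code: warn and carry on with whatever was configured so far.
        log.warning("Cannot use %s to configure session extensions: %s", class_name, e)
    return extensions


# A class that cannot be found leaves the extensions empty and only logs a warning.
missing = apply_configured_extensions(
    {"spark.sql.extensions": "no.such.Configurator"}, Extensions()
)
print(missing.rules)  # []
```

The broad `except` clause is a simplification: it also swallows a `TypeError` raised inside the configurator itself, which the sketch treats the same as a class of the wrong type.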
      

      I think a quick fix would be to call the companion object's getOrCreate method instead of the constructor that takes a SparkContext. Alternatively, we could fix this by ensuring that every constructor attempts to pick up custom extensions when they are set.
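
The difference between the reported behaviour and the two proposed fixes can be illustrated with a toy model in plain Python. This is only a sketch under stated assumptions, not the actual patch: `Session`, `FixedSession`, `CONFIGURATORS`, and `com.example.MyExtensions` are hypothetical, a dictionary lookup stands in for class loading, and the session reuse that the real getOrCreate performs is left out.

```python
class Extensions:
    """Hypothetical stand-in for SparkSessionExtensions."""

    def __init__(self):
        self.rules = []


# Hypothetical registry standing in for loading a configurator class by name.
CONFIGURATORS = {
    "com.example.MyExtensions": lambda ext: ext.rules.append("my_rule"),
}


def configured_extensions(conf):
    """Build an Extensions object and apply the configurator named in conf, if any."""
    ext = Extensions()
    configurator = CONFIGURATORS.get(conf.get("spark.sql.extensions"))
    if configurator is not None:
        configurator(ext)
    return ext


class Session:
    """Models the reported behaviour: the plain constructor ignores the config."""

    def __init__(self, conf, extensions=None):
        self.conf = conf
        # Reported bug: a fresh, empty Extensions object is used here.
        self.extensions = extensions if extensions is not None else Extensions()

    @classmethod
    def get_or_create(cls, conf):
        # First proposed fix: go through the path that reads the config.
        return cls(conf, configured_extensions(conf))


class FixedSession(Session):
    """Second proposed fix: every construction path consults the config."""

    def __init__(self, conf, extensions=None):
        if extensions is None:
            extensions = configured_extensions(conf)
        super().__init__(conf, extensions)


conf = {"spark.sql.extensions": "com.example.MyExtensions"}
print(Session(conf).extensions.rules)                # []
print(Session.get_or_create(conf).extensions.rules)  # ['my_rule']
print(FixedSession(conf).extensions.rules)           # ['my_rule']
```

In this model the first option changes only the caller, while the second changes the constructor so that no caller can bypass the configured extensions.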

              People

              • Assignee: Russell Spitzer (rspitzer)
              • Reporter: Russell Spitzer (rspitzer)
              • Votes: 1
              • Watchers: 5
