Spark / SPARK-19138

Python: new HiveContext will use a stopped SparkContext


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Component/s: PySpark

    Description

      We have users who run a notebook cell that creates a new SparkContext in order to override some of the default initial parameters:

      from pyspark import SparkConf, SparkContext
      from pyspark.sql import HiveContext

      if 'sc' in globals():
          # Stop the running SparkContext if there is one.
          sc.stop()

      conf = SparkConf().setAppName("app")
      # conf.set('spark.sql.shuffle.partitions', '2000')
      sc = SparkContext(conf=conf)
      sqlContext = HiveContext(sc)

      In Spark 2.0, this creates an invalid SQLContext that still uses the original SparkContext, because the HiveContext constructor calls SparkSession.getOrCreate, which returns the cached session holding the old, stopped SparkContext. A SparkSession should be invalidated, and no longer returned by getOrCreate, once its SparkContext has been stopped.
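
      A minimal sketch of the reported behavior using only the public PySpark API (the app names here are placeholders, not from the report): after the first context is stopped, SparkSession.builder.getOrCreate on an affected 2.0.x version can still return the cached session bound to the dead context.

      from pyspark import SparkConf, SparkContext
      from pyspark.sql import SparkSession

      sc = SparkContext(conf=SparkConf().setAppName("first"))
      spark = SparkSession.builder.getOrCreate()  # caches a session bound to sc

      sc.stop()

      # Create a replacement context with new settings.
      sc2 = SparkContext(conf=SparkConf().setAppName("second"))

      # On affected versions, getOrCreate() returns the cached session, which
      # still references the stopped SparkContext instead of sc2.
      spark2 = SparkSession.builder.getOrCreate()
      print(spark2 is spark)  # True on affected versions: stale session reused

      One possible workaround on affected versions is to construct the session explicitly from the new context, e.g. SparkSession(sc2), rather than relying on the cached lookup.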


            People

              Assignee: Unassigned
              Reporter: Ryan Blue (rdblue)
              Votes: 0
              Watchers: 2
