  1. Spark
  2. SPARK-19369

SparkConf not getting properly initialized in PySpark 2.1.0


Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Duplicate
    • 2.1.0
    • None
    • PySpark
    • Windows/Linux

    Description

      While migrating from Spark 1.6 to 2.1, I've stumbled upon a small problem: my SparkContext doesn't get its configuration from the SparkConf object. I've made sure my settings are set on the SparkConf before passing it to the SparkContext constructor.

      I've done some digging and this is what I've found:

      When I initialize the SparkContext, the following code is executed:

      def _do_init(self, master, appName, sparkHome, pyFiles, environment, batchSize, serializer,
                   conf, jsc, profiler_cls):
          self.environment = environment or {}
          if conf is not None and conf._jconf is not None:
              self._conf = conf
          else:
              self._conf = SparkConf(_jvm=SparkContext._jvm)

      So I can see that the only way that my SparkConf will be used is if it also has a _jvm object.
      I've used spark-submit to submit my job and printed the _jvm object but it is null, which explains why my SparkConf object is ignored.
      I've tried running exactly the same on Spark 2.0.1 and it worked! My SparkConf object had a valid _jvm object.
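
      To make the suspected behaviour easier to reason about, here is a minimal pure-Python sketch of the branch quoted above. It does not import PySpark: FakeConf and pick_conf are hypothetical stand-ins invented for illustration, and the sketch assumes the quoted check on conf._jconf is the only thing deciding whether the caller's conf is kept.

```python
# Pure-Python model of the branch quoted above. FakeConf and pick_conf are
# hypothetical stand-ins for illustration; they are not PySpark classes.


class FakeConf:
    """Stand-in for SparkConf: holds settings plus an optional _jconf handle."""

    def __init__(self, jconf=None):
        self._jconf = jconf      # None models "no JVM-backed conf was created"
        self.settings = {}

    def set(self, key, value):
        self.settings[key] = value
        return self              # allow chaining


def pick_conf(conf):
    """Mirror the quoted check: keep conf only if it has a non-None _jconf."""
    if conf is not None and conf._jconf is not None:
        return conf
    # Otherwise a fresh conf is built; the caller's settings are not carried over.
    return FakeConf(jconf=object())


user_conf = FakeConf().set("spark.app.name", "my-app")   # _jconf is None
used = pick_conf(user_conf)
print(used is user_conf)   # False
print(used.settings)       # {}

backed_conf = FakeConf(jconf=object()).set("spark.app.name", "my-app")
print(pick_conf(backed_conf) is backed_conf)   # True
```

      In this model, a conf created without a JVM-backed handle is silently replaced by a fresh one, so its settings never reach the context. That matches what I'm seeing, though the real cause in PySpark may differ.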

      Am I doing something wrong, or is this a bug?

          People

            Assignee: Unassigned
            Reporter: Sidney Feiner (sidfeiner)
            Votes: 0
            Watchers: 3
