Spark / SPARK-20362

spark-submit not considering user-defined configs (PySpark)


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 2.1.0
    • Fix Version/s: None
    • Component/s: PySpark
    • Labels:
      None

      Description

I am trying to set a custom configuration at runtime (PySpark), but in the Spark UI (<ip>:8080) I see my job using the complete node/cluster resources, and the application name is "test.py" (the script name). It looks like the user-defined configurations are not considered at job submit.

Command: spark-submit test.py
Standalone mode (2 worker nodes and 1 master)
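For comparison, the same settings can also be passed to spark-submit on the command line. spark-submit applies --conf (and --driver-memory) before the driver JVM starts, which matters for settings such as spark.driver.memory that cannot take effect once the driver is already running in client mode. A sketch, with the values taken from the script below (the application name "myapp" is an arbitrary placeholder):

```shell
# Sketch: pass the configuration to spark-submit directly, so it is applied
# before the driver JVM starts (required for spark.driver.memory in client mode).
spark-submit \
  --master spark://master:7077 \
  --name myapp \
  --driver-memory 8g \
  --conf spark.executor.memory=8g \
  --conf spark.executor.cores=3 \
  --conf spark.cores.max=10 \
  test.py
```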

      Here is the code:

test.py:

from pyspark.sql import SparkSession, SQLContext, HiveContext
from pyspark import SparkConf

if __name__ == "__main__":
    conf = SparkConf().setAll([('spark.executor.memory', '8g'),
                               ('spark.executor.cores', '3'),
                               ('spark.cores.max', '10'),
                               ('spark.driver.memory', '8g')])
    spark = SparkSession.builder.config(conf=conf).enableHiveSupport().getOrCreate()
    sc = spark.sparkContext
    print(sc.getConf().getAll())
    sqlContext = SQLContext(sc)
    hiveContext = HiveContext(sc)
    print(hiveContext)
    print(sc.getConf().getAll())
    print("Complete")

Output:

      [('spark.jars.packages', 'com.databricks:spark-csv_2.11:1.2.0'), ('spark.local.dir', '/mnt/sparklocaldir/'), ('hive.metastore.warehouse.dir', '<path>'), ('spark.app.id', 'app-20170417221942-0003'), ('spark.jars', 'file:/home/user/.ivy2/jars/com.databricks_spark-csv_2.11-1.2.0.jar,file:/home/user/.ivy2/jars/org.apache.commons_commons-csv-1.1.jar,file:/home/user/.ivy2/jars/com.univocity_univocity-parsers-1.5.1.jar'), ('spark.executor.id', 'driver'), ('spark.app.name', 'test.py'), ('spark.cores.max', '10'), ('spark.serializer', 'org.apache.spark.serializer.KryoSerializer'), ('spark.driver.port', '35596'), ('spark.sql.catalogImplementation', 'hive'), ('spark.sql.warehouse.dir', '<path>'), ('spark.rdd.compress', 'True'), ('spark.driver.memory', '8g'), ('spark.serializer.objectStreamReset', '100'), ('spark.executor.memory', '8g'), ('spark.executor.cores', '3'), ('spark.submit.deployMode', 'client'), ('spark.files', 'file:/home/user/test.py,file:/home/user/.ivy2/jars/com.databricks_spark-csv_2.11-1.2.0.jar,file:/home/user/.ivy2/jars/org.apache.commons_commons-csv-1.1.jar,file:/home/user/.ivy2/jars/com.univocity_univocity-parsers-1.5.1.jar'), ('spark.master', 'spark://master:7077'), ('spark.submit.pyFiles', '/home/user/.ivy2/jars/com.databricks_spark-csv_2.11-1.2.0.jar,/home/user/.ivy2/jars/org.apache.commons_commons-csv-1.1.jar,/home/user/.ivy2/jars/com.univocity_univocity-parsers-1.5.1.jar'), ('spark.driver.host', 'master')]

      <pyspark.sql.context.HiveContext object at 0x7f6f87b2e5f8>

(second getAll() output is identical to the first, above)


              People

• Assignee: Unassigned
• Reporter: harishk15 Harish
• Votes: 0
• Watchers: 1
