Spark / SPARK-7706

Allow setting YARN_CONF_DIR from spark argument

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Incomplete
    • Affects Version/s: 1.3.1
    • Fix Version/s: None
    • Component/s: Spark Submit

    Description

      Currently, when master is set to "yarn" (yarn-cluster mode), SparkSubmitArguments.scala checks whether YARN_CONF_DIR or HADOOP_CONF_DIR is set in the environment (ENV):
      https://github.com/apache/spark/blob/b1f4ca82d170935d15f1fe6beb9af0743b4d81cd/core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala#L236
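
The check described above can be sketched roughly as follows. This is a language-neutral illustration in Python, not Spark's Scala code; the function name `validate_yarn_env` and the exact error wording are assumptions made for the example.

```python
from typing import Mapping


def validate_yarn_env(master: str, env: Mapping[str, str]) -> None:
    """Fail when a YARN master is requested but no conf dir is visible in env.

    Hypothetical helper: it mirrors the behaviour described in this issue,
    not the actual SparkSubmitArguments implementation.
    """
    if master.startswith("yarn"):
        has_hadoop_env = "HADOOP_CONF_DIR" in env or "YARN_CONF_DIR" in env
        if not has_hadoop_env:
            raise ValueError(
                f"When running with master '{master}' either HADOOP_CONF_DIR "
                "or YARN_CONF_DIR must be set in the environment."
            )
```

With this shape, the only way to satisfy the check is through the process environment, which is the limitation this issue is about.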

      However, we should additionally allow passing YARN_CONF_DIR as a command-line argument. This is particularly handy when Spark is launched from schedulers like Oozie or Falcon.

      The reason is that the Oozie launcher app starts in one of the containers assigned by the YARN ResourceManager (RM), and we do not want to set YARN_CONF_DIR in the environment on every node in the cluster. Just passing an argument like -yarnconfdir with the conf dir (e.g. /etc/hadoop/conf) should avoid having to set the ENV variable.
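
One possible shape for the requested behaviour, sketched in Python for illustration only. The `-yarnconfdir` flag is the one proposed in this issue and does not exist in spark-submit; the helper name `resolve_yarn_conf_dir` and the precedence order (argument first, then YARN_CONF_DIR, then HADOOP_CONF_DIR) are assumptions.

```python
from typing import Mapping, Optional, Sequence


def resolve_yarn_conf_dir(argv: Sequence[str], env: Mapping[str, str]) -> Optional[str]:
    """Return the conf dir from the proposed -yarnconfdir argument, else from env.

    Hypothetical helper: the flag name comes from this issue's proposal and
    the fallback order is an assumption, not Spark behaviour.
    """
    args = list(argv)
    cli_value = None
    for i, arg in enumerate(args):
        # Take the token following the flag as its value, if one is present.
        if arg == "-yarnconfdir" and i + 1 < len(args):
            cli_value = args[i + 1]
    if cli_value:
        return cli_value
    # Fall back to the environment variables that are checked today.
    return env.get("YARN_CONF_DIR") or env.get("HADOOP_CONF_DIR")
```

A launcher such as Oozie could then pass the directory per job (for example `-yarnconfdir /etc/hadoop/conf`) without changing the environment on every node.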

      This is blocking us from onboarding Spark from Oozie or Falcon. Thanks.

          People

            Assignee: Unassigned
            Reporter: Shaik Idris Ali (shaik.idris)
            Votes: 0
            Watchers: 3
