Spark / SPARK-16272

Allow configs to reference other configs, env and system properties



    • Type: New Feature
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.1.0
    • Component/s: Spark Core
    • Labels: None


      Currently, Spark's configuration is static: each value is exactly what is written in the config file, with rare exceptions (such as some YARN code that expands Hadoop configuration).

      But a few use cases don't work well with static values. For example, consider spark.sql.hive.metastore.jars, which lists the paths that make up the classpath for accessing Hive's metastore. If you launch an application in cluster mode, the paths configured on the edge node must also be valid on whichever cluster node the driver actually ends up running on.

      This would be easily solved if there were a way to reference system properties or env variables. For example, when YARN launches a container, it sets a number of env variables, which could be used to adjust that path to match the correct location on the node.
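As a rough illustration of the idea, the sketch below expands environment-variable references inside a config value. The `${env:NAME}` syntax, the `expand_env` helper, and the `CONTAINER_DIR` variable are all assumptions made for this example; the description above does not fix a syntax or name any specific variable.

```python
import os
import re

# Hypothetical reference syntax: ${env:NAME}. Not taken from the issue text.
_ENV_REF = re.compile(r"\$\{env:([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value, env=None):
    """Replace each ${env:NAME} in value with that variable's value.

    References to variables that are not set are left untouched.
    """
    env = os.environ if env is None else env

    def substitute(match):
        name = match.group(1)
        return env.get(name, match.group(0))

    return _ENV_REF.sub(substitute, value)


# Example: a node-local directory supplied through the environment.
# CONTAINER_DIR is a made-up variable name used only for illustration.
node_env = {"CONTAINER_DIR": "/data/yarn/container_01"}
jars = expand_env("${env:CONTAINER_DIR}/hive/lib/*", node_env)
print(jars)  # /data/yarn/container_01/hive/lib/*
```

With something along these lines, the same config file could yield a different, node-appropriate path on each machine, because the reference is resolved where the value is read rather than where it was written.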

      So I'm proposing a change where config properties can opt in to this variable expansion feature. It's opt-in to avoid breaking existing code (who knows) and to avoid the extra cost of performing variable expansion on every config read.
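One possible shape for the opt-in behaviour is sketched below. It is a toy model, not Spark's implementation: the `Conf` class, the `${key}` reference syntax, and the `app.*` keys are hypothetical. Only keys registered as expandable pay for substitution; every other key is returned verbatim.

```python
import re

# Hypothetical syntax: ${key} refers to another config key.
_REF = re.compile(r"\$\{([^}]+)\}")


class Conf:
    """Toy config store where only opted-in keys are expanded on read."""

    def __init__(self, settings, expandable=()):
        self._settings = dict(settings)
        self._expandable = set(expandable)  # keys that opted in

    def get(self, key):
        raw = self._settings[key]
        if key not in self._expandable:
            return raw  # no substitution work; value returned verbatim
        return self._expand(raw, seen={key})

    def _expand(self, value, seen):
        def substitute(match):
            ref = match.group(1)
            if ref in seen or ref not in self._settings:
                return match.group(0)  # leave cycles and unknown keys as-is
            return self._expand(self._settings[ref], seen | {ref})

        return _REF.sub(substitute, value)


conf = Conf(
    {
        "app.home": "/opt/app",
        "app.jars": "${app.home}/lib/*",
        "app.banner": "literal ${app.home}",
    },
    expandable={"app.jars"},
)
print(conf.get("app.jars"))    # /opt/app/lib/*
print(conf.get("app.banner"))  # literal ${app.home}
```

Keeping expansion per key means a value that happens to contain `${...}` today keeps its literal meaning unless its key is explicitly registered.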




            vanzin Marcelo Masiero Vanzin