SPARK-16272

Allow configs to reference other configs, env and system properties


Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.1.0
    • Component/s: Spark Core
    • Labels: None

    Description

      Currently, Spark's configuration is static; it is whatever is written to the config file, with some rare exceptions (such as some YARN code that does expansion of Hadoop configuration).

      But there are a few use cases that don't work well in that situation. For example, consider spark.sql.hive.metastore.jars. It references a list of paths containing the classpath for accessing Hive's metastore. If you're launching an application in cluster mode, whatever is in the edge node's configuration needs to match the configuration of whichever node in the cluster the driver actually ends up running on.

      This would be easily solved if there were a way to reference system properties or env variables; for example, when YARN launches a container, a bunch of env variables are set, which could be used to modify that path to match the correct location on the node.

      So I'm proposing a change where config properties can opt in to this variable expansion feature; it's opt-in to avoid breaking existing code (who knows) and to avoid the extra cost of doing variable expansion on every config read.
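      The proposed opt-in expansion could be sketched roughly as below. This is a minimal illustration, not Spark's actual implementation: the object and method names are hypothetical, and Spark's real reference syntax and resolution rules may differ. It resolves references of the form ${env:VAR}, ${system:prop}, and ${some.other.config}, leaving unresolved references untouched.

      ```scala
      object ConfigExpansion {
        // Matches ${env:VAR}, ${system:prop}, or ${other.config.key}.
        private val Ref = """\$\{(?:(env|system):)?([^\}]+)\}""".r

        // Expand variable references in a config value. Callers pass in the
        // config map plus the env/system-property maps to resolve against.
        def expand(value: String,
                   conf: Map[String, String],
                   env: Map[String, String] = sys.env,
                   sysProps: Map[String, String] = sys.props.toMap): String = {
          Ref.replaceAllIn(value, m => {
            val resolved = m.group(1) match {
              case "env"    => env.get(m.group(2))
              case "system" => sysProps.get(m.group(2))
              case _        => conf.get(m.group(2))   // plain config reference
            }
            // Keep unresolved references as-is instead of failing; quote the
            // replacement so literal '$' isn't treated as a group reference.
            java.util.regex.Matcher.quoteReplacement(resolved.getOrElse(m.matched))
          })
        }
      }
      ```

      A property that opts in would run its value through expand() at read time, so a value like ${env:PWD}/hive-jars resolves on the node where the driver actually runs rather than on the edge node.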

          People

            Assignee: vanzin Marcelo Masiero Vanzin
            Reporter: vanzin Marcelo Masiero Vanzin
            Votes: 0
            Watchers: 3
