Spark / SPARK-19739

SparkHadoopUtil.appendS3AndSparkHadoopConfigurations to propagate full set of AWS env vars


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.1.0
    • Fix Version/s: 2.2.0
    • Component/s: Spark Core
    • Labels: None

    Description

      SparkHadoopUtil.appendS3AndSparkHadoopConfigurations() propagates the AWS user and secret key to the s3n and s3a config options, so that, when the env vars are set, the user's secrets reach the cluster.
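
      For context, a minimal sketch of roughly what that propagation does today (the method name here is illustrative; the fs.s3/s3n/s3a keys are the connectors' standard credential properties):

      {code:scala}
      import org.apache.hadoop.conf.Configuration

      // Rough sketch of the existing behaviour: copy the access key pair from the
      // environment into the s3, s3n and s3a credential properties.
      def appendS3Credentials(hadoopConf: Configuration): Unit = {
        val keyId = System.getenv("AWS_ACCESS_KEY_ID")
        val accessKey = System.getenv("AWS_SECRET_ACCESS_KEY")
        if (keyId != null && accessKey != null) {
          hadoopConf.set("fs.s3.awsAccessKeyId", keyId)
          hadoopConf.set("fs.s3n.awsAccessKeyId", keyId)
          hadoopConf.set("fs.s3a.access.key", keyId)
          hadoopConf.set("fs.s3.awsSecretAccessKey", accessKey)
          hadoopConf.set("fs.s3n.awsSecretAccessKey", accessKey)
          hadoopConf.set("fs.s3a.secret.key", accessKey)
        }
      }
      {code}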

      AWS also supports session authentication (env var AWS_SESSION_TOKEN) and region endpoints (env var AWS_DEFAULT_REGION), the latter being critical if you want to address V4-auth-only endpoints such as Frankfurt and Seoul.

      These env vars should be picked up and passed down to s3a too. It's only 4+ lines of code, but impossible to test unless the existing code is refactored to take the env vars as a Map[String, String], so that a test suite can set the values in its own map.
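
      A sketch of what such a refactor could look like; the helper name is hypothetical, fs.s3a.session.token is the S3A session-credential property, and the AWS_DEFAULT_REGION-to-endpoint mapping shown is only illustrative:

      {code:scala}
      import org.apache.hadoop.conf.Configuration

      // Hypothetical refactor: take the environment as a Map so a test suite can
      // supply its own values instead of relying on System.getenv.
      def appendS3EnvVars(hadoopConf: Configuration, env: Map[String, String]): Unit = {
        for {
          keyId <- env.get("AWS_ACCESS_KEY_ID")
          accessKey <- env.get("AWS_SECRET_ACCESS_KEY")
        } {
          hadoopConf.set("fs.s3a.access.key", keyId)
          hadoopConf.set("fs.s3a.secret.key", accessKey)
          // The session token is only meaningful alongside the key pair.
          env.get("AWS_SESSION_TOKEN").foreach { token =>
            hadoopConf.set("fs.s3a.session.token", token)
          }
        }
        // Region: illustrative mapping to a regional endpoint; the exact property and
        // format to use is part of what this issue needs to settle.
        env.get("AWS_DEFAULT_REGION").foreach { region =>
          hadoopConf.set("fs.s3a.endpoint", s"s3.$region.amazonaws.com")
        }
      }

      // Production code would pass sys.env; a test can pass its own map, e.g.
      // appendS3EnvVars(conf, Map("AWS_ACCESS_KEY_ID" -> "id", "AWS_SECRET_ACCESS_KEY" -> "key"))
      {code}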

      Side issue: what if only half the env vars are set and users are trying to understand why auth is failing? It may be good to build up a string identifying which env vars had their values propagated, and log that at debug level, while obviously not logging the values themselves.
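
      For example (hypothetical helper, names illustrative), the propagated names could be collected and logged without their values:

      {code:scala}
      // Hypothetical helper: list which AWS env var names were present and propagated,
      // without ever touching the values.
      def describePropagatedEnvVars(env: Map[String, String]): String = {
        Seq("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_SESSION_TOKEN", "AWS_DEFAULT_REGION")
          .filter(env.contains)
          .mkString(", ")
      }

      // e.g. logDebug(s"AWS env vars propagated to Hadoop conf: ${describePropagatedEnvVars(sys.env)}")
      {code}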

    Attachments

    Activity

    People

    Assignee: Genmao Yu (uncleGen)
    Reporter: Steve Loughran (stevel@apache.org)
    Votes: 0
    Watchers: 4
