Currently the S3 native implementation, org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore, requires credentials to be set explicitly. Amazon also allows credentials to be assigned to EC2 instances rather than users, via IAM roles. These credentials are rotated frequently and kept in a local cache, all of which is handled by the AWS SDK (in this case, the AmazonS3Client). The SDK checks the following sources, in order, to determine whether credentials are set explicitly or provided via a role:
- Environment Variables: AWS_ACCESS_KEY_ID and AWS_SECRET_KEY
- Java System Properties: aws.accessKeyId and aws.secretKey
- Instance Metadata Service, which provides the credentials associated with the IAM role for the EC2 instance
See http://docs.aws.amazon.com/IAM/latest/UserGuide/role-usecase-ec2app.html for details.
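The lookup order above can be illustrated with a minimal, standard-library-only sketch. This is not the AWS SDK's code: the class CredentialChainSketch, its Credentials holder, and the resolve method are hypothetical names introduced for illustration, and the instance metadata lookup is passed in as a stub rather than performing a real call to the metadata service.

```java
import java.util.Map;
import java.util.Properties;
import java.util.function.Supplier;

// Hypothetical illustration of the lookup order; not part of the AWS SDK or Hadoop.
final class CredentialChainSketch {

    /** Holder for a key pair plus the name of the source it came from. */
    static final class Credentials {
        final String accessKey;
        final String secretKey;
        final String source;

        Credentials(String accessKey, String secretKey, String source) {
            this.accessKey = accessKey;
            this.secretKey = secretKey;
            this.source = source;
        }
    }

    /** Returns null unless both halves of the key pair are present. */
    private static Credentials build(String accessKey, String secretKey, String source) {
        if (accessKey == null || accessKey.isEmpty()
                || secretKey == null || secretKey.isEmpty()) {
            return null;
        }
        return new Credentials(accessKey, secretKey, source);
    }

    /**
     * Tries each source in the order listed in this issue:
     * environment variables, then Java system properties, then the
     * role credentials supplier (assumed here to return null if no role is attached).
     */
    static Credentials resolve(Map<String, String> env,
                               Properties props,
                               Supplier<Credentials> instanceMetadata) {
        Credentials c = build(env.get("AWS_ACCESS_KEY_ID"),
                              env.get("AWS_SECRET_KEY"), "environment");
        if (c != null) {
            return c;
        }
        c = build(props.getProperty("aws.accessKeyId"),
                  props.getProperty("aws.secretKey"), "system-properties");
        if (c != null) {
            return c;
        }
        c = instanceMetadata.get();
        if (c != null) {
            return c;
        }
        throw new IllegalStateException("No AWS credentials found in any source");
    }

    public static void main(String[] args) {
        // Stub standing in for the Instance Metadata Service lookup.
        Supplier<Credentials> stubbedRole =
            () -> new Credentials("roleKey", "roleSecret", "instance-metadata");
        Credentials c = resolve(System.getenv(), System.getProperties(), stubbedRole);
        System.out.println("Credentials resolved from: " + c.source);
    }
}
```

The point of the sketch is that explicitly configured credentials take precedence, and the role-based credentials are only consulted when nothing else is set, which is why a store built on the SDK would work on an EC2 instance without any keys in the configuration.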
To support this feature, the current NativeFileSystemStore implementation needs to be changed to use the AWS SDK instead of the JetS3t S3 libraries.
A request for this feature has previously been raised as part of the Flume project (FLUME-1691), where the HDFS-on-top-of-S3 implementation is used as a means of writing logs to S3 via an HDFS Sink.