Hadoop 2.6-2.7 uses the full AWS SDK JAR (aws-java-sdk). Hadoop 2.8+ has switched to the S3-only JAR (aws-java-sdk-s3) because it is lighter weight.
I want to return to the full JAR before 2.8 ships, for the following reasons:
- downstream code: if someone already includes, depends on, or upgrades the AWS SDK, switching to the S3 SDK complicates packaging and distribution. If the SDK is depended on directly via Maven dependencies, it breaks the build.
- some of the 2.8+ patches, e.g. HADOOP-12537, have to add another part of the SDK to handle temporary credentials. This will make life even more complex downstream.
- if the hadoop-aws module ever adds more features (e.g. an s3mper-style use of DynamoDB for directory structure storage), that again means more JARs and more complexity.
Let's just change the build to return to the original JAR. Yes, it is heavy, but it will be a consistent heaviness for all projects downstream.
This change must go into 2.8 if we don't want to start breaking things.
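
As a rough sketch of what the build change could look like, the dependency in the hadoop-aws pom.xml would go back from the S3-only artifact to the full one. This assumes the two artifacts are com.amazonaws:aws-java-sdk-s3 and com.amazonaws:aws-java-sdk, and that the version is managed in a parent POM; the exact file layout and scope are assumptions, not taken from the actual patch.

```xml
<!-- Sketch only: assumed shape of the hadoop-aws dependency change. -->

<!-- Before (2.8 branch, assumed): S3-only artifact.
<dependency>
  <groupId>com.amazonaws</groupId>
  <artifactId>aws-java-sdk-s3</artifactId>
  <scope>compile</scope>
</dependency>
-->

<!-- After (proposed): the full SDK artifact, as in 2.6 and 2.7.
     The version is assumed to be set in a parent dependencyManagement section. -->
<dependency>
  <groupId>com.amazonaws</groupId>
  <artifactId>aws-java-sdk</artifactId>
  <scope>compile</scope>
</dependency>
```

With a single full-SDK dependency, downstream projects that already manage their own AWS SDK version would only have one artifact to exclude or override.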