[HADOOP-10400] Incorporate new S3A FileSystem implementation - ASF JIRA

Details

Type: New Feature
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 2.4.0
Fix Version/s: 2.6.0
Component/s: fs, fs/s3
Labels: None
Target Version/s: 2.6.0
Hadoop Flags: Reviewed

Description

The s3native filesystem has a number of limitations (some of which were recently fixed by HADOOP-9454). This patch adds an s3a filesystem that uses the AWS SDK instead of the jets3t library. It offers a number of improvements over s3native, including:

Parallel copy (rename) support (dramatically speeds up commits on large files)
Empty-directory marker files compatible with the AWS S3 explorer: "xyz/" instead of "xyz_$folder$" (reduces littering)
Ignores _$folder$ files created by s3native and other S3 browsing utilities
Supports multiple output buffer dirs to even out IO when uploading files
Supports IAM role-based authentication
Allows setting a default canned ACL for uploads (public, private, etc.)
Better error recovery handling
Should handle input seeks without having to download the whole file (seeks are used heavily when reading splits)

This code is a copy of https://github.com/Aloisius/hadoop-s3a with patches to various pom files to get it to build against trunk. I've been using 0.0.1 in production with CDH 4 for several months and CDH 5 for a few days. The version here is 0.0.2, which renames some keys to bring the key name style more in line with the rest of Hadoop 2.x.

Tunable parameters:

fs.s3a.access.key - Your AWS access key ID (omit for role authentication)
fs.s3a.secret.key - Your AWS secret key (omit for role authentication)
fs.s3a.connection.maximum - Controls how many parallel connections HttpClient spawns (default: 15)
fs.s3a.connection.ssl.enabled - Enables or disables SSL connections to S3 (default: true)
fs.s3a.attempts.maximum - How many times we should retry commands on transient errors (default: 10)
fs.s3a.connection.timeout - Socket connect timeout (default: 5000)
fs.s3a.paging.maximum - How many keys to request from S3 at a time when doing directory listings (default: 5000)
fs.s3a.multipart.size - Size (in bytes) of the parts that an upload or copy operation is split into (default: 104857600)
fs.s3a.multipart.threshold - Until a file is this large (in bytes), use non-parallel upload (default: 2147483647)
fs.s3a.acl.default - Set a canned ACL on newly created/copied objects (private | public-read | public-read-write | authenticated-read | log-delivery-write | bucket-owner-read | bucket-owner-full-control)
fs.s3a.multipart.purge - True if you want to purge existing multipart uploads that may not have been completed/aborted correctly (default: false)
fs.s3a.multipart.purge.age - Minimum age in seconds of multipart uploads to purge (default: 86400)
fs.s3a.buffer.dir - Comma-separated list of directories used to buffer file writes (default: uses ${hadoop.tmp.dir}/s3a )
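
As a rough illustration of how these keys might be supplied, the Python sketch below renders a few of them as a Hadoop-style configuration XML document (the `<configuration>`/`<property>` layout used by core-site.xml). The property names are taken from the list above; the values, the buffer directory paths, and the helper name `render_core_site` are assumptions made up for this example, not recommendations or part of the patch.

```python
import xml.etree.ElementTree as ET

# Property names come from the list above; the values are placeholders
# chosen for illustration only.
S3A_SETTINGS = {
    "fs.s3a.connection.maximum": "15",
    "fs.s3a.attempts.maximum": "10",
    "fs.s3a.multipart.size": str(100 * 1024 * 1024),  # 104857600 bytes
    "fs.s3a.acl.default": "private",
    # Hypothetical local directories, one per disk, to spread buffer IO.
    "fs.s3a.buffer.dir": "/mnt/disk1/s3a,/mnt/disk2/s3a",
}


def render_core_site(settings):
    """Render a dict of settings as a Hadoop <configuration> XML string."""
    root = ET.Element("configuration")
    for name, value in settings.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(render_core_site(S3A_SETTINGS))
```

fs.s3a.access.key and fs.s3a.secret.key are left out on purpose, which corresponds to the role-authentication case described above.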

Caveats:

Hadoop uses a standard output committer which uploads files as filename.COPYING before renaming them. This can cause unnecessary performance issues with S3 because S3 does not have a rename operation and already verifies uploads against an MD5 that the driver sets on the upload request. While this FileSystem should be significantly faster than the built-in s3native driver because of parallel copy support, you may want to consider setting a null output committer on your jobs to further improve performance.

Because S3 requires the file length and MD5 to be known before a file is uploaded, all output is first buffered to a temporary file, similar to the s3native driver.

Due to the lack of a native rename() for S3, renaming extremely large files or directories may take a while. Unfortunately, there is no way to notify Hadoop that progress is still being made for rename operations, so your job may time out unless you increase the task timeout.

This driver will fully ignore _$folder$ files. This was necessary so that it could interoperate with repositories that have had the s3native driver used on them, but it means that it won't recognize empty directories created by s3native.
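
The effect of this rule on a listing can be sketched in a few lines of Python. This is not the driver's code; the function names and the sample keys are hypothetical, and the only facts taken from the description are the "xyz/" marker style and the ignored "_$folder$" suffix.

```python
LEGACY_SUFFIX = "_$folder$"  # marker style written by s3native


def classify_key(key):
    """Classify an S3 object key the way the description suggests."""
    if key.endswith(LEGACY_SUFFIX):
        return "ignored"    # s3native-style marker, skipped entirely
    if key.endswith("/"):
        return "directory"  # s3a-style empty directory marker, e.g. "xyz/"
    return "file"


def visible_keys(keys):
    """Drop legacy markers from a listing, keeping everything else."""
    return [k for k in keys if classify_key(k) != "ignored"]


if __name__ == "__main__":
    listing = ["logs/", "old_$folder$", "data/part-00000"]
    for key in listing:
        print(key, "->", classify_key(key))
    print(visible_keys(listing))
```

Under this sketch, a directory that exists only as "old_$folder$" disappears from the listing, which is the empty-directory caveat described above.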

Statistics for the filesystem may be calculated differently than for the s3native filesystem. When uploading a file, we do not count writing the temporary file on the local filesystem towards the local filesystem's written bytes count. When renaming files, we do not count the S3->S3 copy as read or write operations. Unlike the s3native driver, we only count bytes written when we start the upload (as opposed to the write calls to the temporary local file). The driver also counts read and write ops, but this is done mostly to keep large S3 operations from timing out.

The AWS SDK unfortunately passes the multipart threshold as an int, which means fs.s3a.multipart.threshold cannot be greater than 2^31-1 (2147483647).
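
To make the interaction between fs.s3a.multipart.size, fs.s3a.multipart.threshold, and the int limit concrete, here is a small Python sketch. It is an illustration under stated assumptions, not the driver's logic: the function names are hypothetical, and treating a file exactly at the threshold as multipart is a guess based on the wording "until a file is this large".

```python
INT_MAX = 2**31 - 1  # 2147483647, the largest signed 32-bit int


def validate_threshold(threshold_bytes):
    """Reject thresholds that would not fit in a signed 32-bit int."""
    if not 0 <= threshold_bytes <= INT_MAX:
        raise ValueError(
            "fs.s3a.multipart.threshold must be between 0 and %d" % INT_MAX
        )
    return threshold_bytes


def plan_upload(file_size, part_size=104857600, threshold=INT_MAX):
    """Return the number of parts; 1 means a single non-parallel upload.

    Defaults mirror the documented defaults for fs.s3a.multipart.size
    and fs.s3a.multipart.threshold.
    """
    validate_threshold(threshold)
    if file_size < threshold:  # boundary handling is an assumption
        return 1
    # Integer ceiling division: the last part may be smaller than part_size.
    return (file_size + part_size - 1) // part_size


if __name__ == "__main__":
    mib = 1024 * 1024
    print(plan_upload(50 * mib))                        # below default threshold
    print(plan_upload(250 * mib, threshold=100 * mib))  # split into parts
```

With the defaults, only files of roughly 2 GiB or more are split, so lowering the threshold is what enables parallel uploads for smaller files.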

This is currently implemented as a FileSystem and not an AbstractFileSystem.

Attachments

HADOOP-10400-1.patch           11/Mar/14 02:10  74 kB  Jordan Mendelson
HADOOP-10400-2.patch           11/Mar/14 03:24  74 kB  Jordan Mendelson
HADOOP-10400-3.patch           11/Mar/14 04:14  75 kB  Jordan Mendelson
HADOOP-10400-4.patch           11/Mar/14 06:58  75 kB  Jordan Mendelson
HADOOP-10400-5.patch           19/Mar/14 23:54  75 kB  Jordan Mendelson
HADOOP-10400-6.patch           01/Jul/14 15:21  75 kB  Matteo Bertozzi
HADOOP-10400-7.patch           11/Sep/14 10:09  68 kB  David S. Wang
HADOOP-10400-8.patch           13/Sep/14 00:11  97 kB  David S. Wang
HADOOP-10400-8-branch-2.patch  13/Sep/14 15:01  96 kB  David S. Wang
HADOOP-10400-branch-2.patch    11/Sep/14 18:17  67 kB  David S. Wang

Issue Links

blocks

HADOOP-10676 S3AOutputStream not reading new config knobs for multipart configs (Closed)
HADOOP-10677 ExportSnapshot fails on kerberized cluster using s3a (Closed)
HADOOP-10675 Add server-side encryption functionality to s3a (Closed)

depends upon

HADOOP-10714 AmazonS3Client.deleteObjects() need to be limited to 1000 entries per call (Closed)
HADOOP-9361 Strictly define the expected behavior of filesystem APIs and write tests to verify compliance (Closed)
HADOOP-10373 create tools/hadoop-amazon for aws/EMR support (Closed)

is depended upon by

HADOOP-11262 Enable YARN to use S3A (Resolved)
HADOOP-11183 Memory-based S3AOutputstream (Closed)
HADOOP-11091 Eliminate old configuration parameter names from s3a (Closed)
HADOOP-11171 Enable using a proxy server to connect to S3a. (Closed)
HADOOP-11261 Set custom endpoint for S3A (Closed)

is duplicated by

HADOOP-13277 Need To Support IAM role based access for supporting Amazon S3 (Resolved)

is related to

HADOOP-13402 S3A should allow renaming to a pre-existing destination directory to move the source path under that directory, similar to HDFS. (Resolved)
HADOOP-11571 Über-jira: S3a stabilisation phase I (Closed)

relates to

HADOOP-9680 Extend S3FS and S3NativeFS to work with AWS IAM Temporary Security Credentials (Resolved)
HADOOP-9454 Support multipart uploads for s3native (Closed)
HADOOP-9384 Update S3 native fs implementation to use AWS SDK to support authorization through roles (Closed)

