[HADOOP-13868] New defaults for S3A multi-part configuration - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 2.7.0, 3.0.0-alpha1
Fix Version/s: None
Component/s: fs/s3
Labels: None

Description

I've been looking at a big performance regression when writing to S3 from Spark that appears to have been introduced with HADOOP-12891.

In the Amazon SDK, the default threshold for multi-part copies is 320x the default threshold for multi-part uploads (and the default block size is 20x bigger), so I don't think it's wise for us to use the same values for both.
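For reference, the 320x and 20x figures can be reproduced with a little arithmetic. This is only a sketch: the four byte values below are assumptions, taken from what I understand the AWS SDK for Java `TransferManagerConfiguration` defaults to be, and should be checked against the SDK version in use. The dictionary keys are illustrative names, not SDK identifiers.

```python
# Sizes in bytes.
MB = 1024 * 1024
GB = 1024 * MB

# ASSUMPTION: these mirror the AWS SDK for Java TransferManagerConfiguration
# defaults as I understand them; verify against the SDK version in use.
sdk_defaults = {
    "multipart_upload_threshold": 16 * MB,
    "multipart_copy_threshold": 5 * GB,
    "minimum_upload_part_size": 5 * MB,
    "multipart_copy_part_size": 100 * MB,
}

# How much larger the copy defaults are than the upload defaults.
threshold_ratio = (
    sdk_defaults["multipart_copy_threshold"]
    // sdk_defaults["multipart_upload_threshold"]
)
part_size_ratio = (
    sdk_defaults["multipart_copy_part_size"]
    // sdk_defaults["minimum_upload_part_size"]
)

print(threshold_ratio, part_size_ratio)
print(sdk_defaults["multipart_copy_part_size"])
```

Under those assumed values, the copy part size works out to 104857600 bytes, which is the block size figure used in the tests.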

I did some quick tests, and the sweet spot where multi-part copies start being faster seems to be around 512 MB. The effect was smaller, but using 104857600 bytes (100 MB, Amazon's default) for the block size was also slightly better.

I propose we do the following, although they're independent decisions:

(1) Split the configuration. Ideally, I'd like to have fs.s3a.multipart.copy.threshold and fs.s3a.multipart.upload.threshold (and corresponding properties for the block size). But then there's the question of what to do with the existing fs.s3a.multipart.* properties. Deprecate them? Or leave them as a shorthand for configuring both, overridden by the more specific properties?

(2) Consider increasing the default values. In my tests, 256 MB seemed to be where multipart uploads came into their own, and 512 MB was where multipart copies started outperforming the alternative. I'd be interested to hear what other people have seen.
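To make the "shorthand" variant of option (1) concrete, here is a minimal sketch of how the lookup could behave: the operation-specific property wins, then the existing shared fs.s3a.multipart.* property, then a built-in default. It is not taken from either patch. The function name `resolve_multipart_setting`, the plain dict standing in for a Hadoop `Configuration`, and the per-operation block size property names (e.g. fs.s3a.multipart.copy.size) are all hypothetical.

```python
MB = 1024 * 1024


def resolve_multipart_setting(conf, operation, setting, default):
    """Resolve a multipart setting with specific-then-shared fallback.

    conf      -- dict of property name to value (stand-in for a Hadoop Configuration)
    operation -- "copy" or "upload"
    setting   -- "threshold" or "size"
    default   -- value used when neither property is set
    """
    # Proposed operation-specific property, e.g. fs.s3a.multipart.copy.threshold.
    # The ".size" variants are hypothetical names for the block size properties.
    specific_key = f"fs.s3a.multipart.{operation}.{setting}"
    # Existing shared property, e.g. fs.s3a.multipart.threshold.
    shared_key = f"fs.s3a.multipart.{setting}"
    for key in (specific_key, shared_key):
        if key in conf:
            return int(conf[key])
    return default


# Example: the shared threshold is set, and only the copy threshold is overridden.
conf = {
    "fs.s3a.multipart.threshold": 256 * MB,
    "fs.s3a.multipart.copy.threshold": 512 * MB,
}
copy_threshold = resolve_multipart_setting(conf, "copy", "threshold", 128 * MB)
upload_threshold = resolve_multipart_setting(conf, "upload", "threshold", 128 * MB)
copy_size = resolve_multipart_setting(conf, "copy", "size", 100 * MB)
```

In this example the copy threshold resolves to the specific 512 MB value, the upload threshold falls back to the shared 256 MB value, and the copy block size falls through to the supplied default because neither property is set. The 128 MB default passed in is arbitrary and only there to show the last fallback step.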

Attachments

HADOOP-13868.002.patch: 09/Dec/16 14:56, 3 kB, Sean Mackrory
optimizing-multipart-s3a.sh: 08/Dec/16 22:50, 2 kB, Sean Mackrory
HADOOP-13868.001.patch: 08/Dec/16 22:50, 3 kB, Sean Mackrory

Issue Links

relates to: HADOOP-12891 S3AFileSystem should configure Multipart Copy threshold and chunk size (Resolved)
links to: GitHub Pull Request #1125

People

Assignee: Sean Mackrory
Reporter: Sean Mackrory
Votes: 0
Watchers: 6

Dates

Created: 06/Dec/16 17:18
Updated: 13/Aug/19 19:29
Resolved: 13/Aug/19 19:29