[HADOOP-14381] S3AUtils.translateException to map 503 response to throttling failure - ASF JIRA

Details

Type: Sub-task
Status: Resolved
Priority: Major
Resolution: Duplicate
Affects Version/s: 2.8.0
Fix Version/s: None
Component/s: fs/s3
Labels: None

Description

When AWS S3 returns a 503, it means that the overall set of requests against one part of an S3 bucket exceeds the permitted limit; the clients need to throttle back or back off until some rebalancing completes.

The AWS SDK retries 3 times on a 503, then rethrows it. Our code does nothing with it other than wrap it in a generic AWSS3IOException.

Proposed

add a new exception, AWSOverloadedException
raise it on a 503 from S3 (and, for S3Guard, on DynamoDB complaints)
have it include a link to a wiki page on the topic, as well as the path and any other diagnostics
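
A minimal, standalone sketch of what the proposal might look like. Only the name AWSOverloadedException comes from this issue; the constructor shape, the getPath() accessor, the WIKI_URL placeholder and the ThrottleTranslation helper are assumptions for illustration, and the helper takes a bare status code rather than reproducing the real S3AUtils.translateException code.

```java
import java.io.IOException;

/** Hypothetical exception for a 503 from S3; only the name comes from the proposal. */
class AWSOverloadedException extends IOException {
  // Placeholder: the issue does not say which wiki page to link to.
  static final String WIKI_URL = "<wiki page on S3 throttling>";

  private final String path;

  AWSOverloadedException(String operation, String path, Throwable cause) {
    super(operation + " on " + path
        + ": request throttled by the service (503); see " + WIKI_URL, cause);
    this.path = path;
  }

  /** The path being accessed when the throttling response arrived. */
  String getPath() {
    return path;
  }
}

/** Stand-in for the translation step; this is not the real S3AUtils code. */
final class ThrottleTranslation {
  static final int SC_SERVICE_UNAVAILABLE = 503;

  private ThrottleTranslation() {
  }

  /** Map a failed call to an IOException, singling out 503 as a throttling failure. */
  static IOException translate(String operation, String path,
      int statusCode, Exception cause) {
    if (statusCode == SC_SERVICE_UNAVAILABLE) {
      return new AWSOverloadedException(operation, path, cause);
    }
    // Everything else stays a generic failure in this sketch.
    return new IOException(
        operation + " on " + path + ": status " + statusCode, cause);
  }
}
```

Because the sketch extends IOException, existing callers that only catch IOException would keep working, while code that wants to react to throttling could catch the subclass explicitly.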

Code talking to S3 may then be able to catch this and choose how to react. Retrying with exponential backoff is the obvious option. Failing outright could trigger task reattempts for that part of the query, then a job retry, which will fail again unless the number of tasks run in parallel is reduced.
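
As a rough illustration of the retry option, here is a standalone sketch of exponential backoff around a throttled call. ThrottledException stands in for the proposed AWSOverloadedException, and BackoffRetry, its method names and its parameters are all hypothetical; none of this is existing S3A code.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

/** Hypothetical stand-in for the proposed throttling exception. */
class ThrottledException extends IOException {
  ThrottledException(String message) {
    super(message);
  }
}

final class BackoffRetry {
  private BackoffRetry() {
  }

  /** Delay before retry number attempt (0-based): base * 2^attempt, capped at maxMillis. */
  static long delayMillis(int attempt, long baseMillis, long maxMillis) {
    // Clamp the shift so a large attempt count cannot overflow the long.
    int shift = Math.min(attempt, 20);
    return Math.min(maxMillis, baseMillis << shift);
  }

  /** Run the operation, sleeping and retrying only when it reports throttling. */
  static <T> T call(Callable<T> operation, int maxRetries,
      long baseMillis, long maxMillis) throws Exception {
    for (int attempt = 0; ; attempt++) {
      try {
        return operation.call();
      } catch (ThrottledException e) {
        if (attempt >= maxRetries) {
          // Out of retries: rethrow so the caller (for example the task) decides.
          throw e;
        }
        Thread.sleep(delayMillis(attempt, baseMillis, maxMillis));
      }
    }
  }
}
```

A caller might wrap a single request as `BackoffRetry.call(() -> doRequest(), 3, 500, 10_000)`, where doRequest is whatever hypothetical operation talks to S3. Since the throttling applies across all clients hitting the same part of the bucket, adding random jitter to each delay would probably be worth considering so that clients do not all retry at the same moment.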

As this throttling applies across all clients talking to the same part of a bucket, any real fix probably has to be made at a higher level. We can at least start by reporting things better.

Issue Links

is duplicated by

HADOOP-13786 Add S3A committers for zero-rename commits to S3 endpoints

Resolved

is part of

HADOOP-13786 Add S3A committers for zero-rename commits to S3 endpoints

Resolved

is related to

HADOOP-14303 Review retry logic on all S3 SDK calls, implement where needed

Resolved

People

Assignee: Steve Loughran

Reporter: Steve Loughran

Votes: 0

Watchers: 4

Dates

Created: 04/May/17 18:37

Updated: 22/Mar/18 05:16

Resolved: 22/Mar/18 05:16