Details
- Sub-task
- Status: Open
- Minor
- Resolution: Unresolved
- 3.1.1
- None
- None
Description
Problem: if an application opens too many input streams, all the HTTP connections in the S3A pool can be used up; every attempt at another FS operation then fails, timing out while waiting for a connection from the pool.
Proposed simple solution: log the input stream lifecycle better, specifically:
- include the URL of the file in open, reopen and close events
- maybe: use a separate logger for these events, though the S3A input stream logger should be enough, as it doesn't do much else.
- maybe: put a prefix such as "Lifecycle" in the events, so that you can enable the existing log at debug, grep for that phrase and read the printed URLs to identify what's going on
- stream metrics: expose some of the state of the HTTP connection pool and/or the active input and output streams
Idle output streams don't use up HTTP connections, as they only connect during block upload.
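A minimal sketch of what the proposal could look like, using only the JDK. Everything here is an assumption for illustration: the class `StreamLifecycleTracker`, its method names, the exact message format and the `s3a://bucket/...` paths are hypothetical and are not existing S3A code. The real stream would log through its own logger rather than return strings. The sketch only shows the two ideas: a greppable "Lifecycle" prefix carrying the file URL, and a count of active input streams that could be exposed as a metric.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper, not part of S3A: tags lifecycle events and counts open input streams. */
class StreamLifecycleTracker {
    /** Prefix to grep for in the debug log; "Lifecycle" is the word floated in this issue. */
    static final String PREFIX = "Lifecycle";

    // Number of currently open input streams, keyed by file URI.
    private final Map<String, AtomicInteger> openByUri = new ConcurrentHashMap<>();

    /** Builds the log line for an event ("open", "reopen" or "close") on a file URI. */
    static String format(String event, String uri) {
        return PREFIX + ": " + event + " " + uri;
    }

    /** Records a newly opened stream and returns the line to log. */
    String opened(String uri) {
        openByUri.computeIfAbsent(uri, k -> new AtomicInteger()).incrementAndGet();
        return format("open", uri);
    }

    /** A reopen happens on an already-open stream, so the stream count is unchanged. */
    String reopened(String uri) {
        return format("reopen", uri);
    }

    /** Records a closed stream, dropping the map entry once its count reaches zero. */
    String closed(String uri) {
        openByUri.computeIfPresent(uri, (k, n) -> n.decrementAndGet() <= 0 ? null : n);
        return format("close", uri);
    }

    /** Total number of input streams currently open: a candidate value for a gauge. */
    int activeStreams() {
        return openByUri.values().stream().mapToInt(AtomicInteger::get).sum();
    }

    public static void main(String[] args) {
        StreamLifecycleTracker tracker = new StreamLifecycleTracker();
        System.out.println(tracker.opened("s3a://bucket/data/part-0000"));
        System.out.println(tracker.opened("s3a://bucket/data/part-0001"));
        System.out.println(tracker.reopened("s3a://bucket/data/part-0001"));
        System.out.println(tracker.closed("s3a://bucket/data/part-0000"));
        System.out.println("active=" + tracker.activeStreams());
    }
}
```

With lines in this shape, `grep Lifecycle` over a debug log would list every open, reopen and close with its URL, and streams that were opened but never closed would stand out as the likely holders of pool connections.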
Issue Links
- relates to
- HADOOP-17338 Intermittent S3AInputStream failures: Premature end of Content-Length delimited message body etc
- Resolved