Details
Type: Sub-task
Status: Resolved
Priority: Major
Resolution: Fixed
3.4.1
Description
A recurring problem is that applications forget to close their input streams; eventually the HTTP connection pool runs out of connections.
Having the finalizer close streams during GC will ensure that HTTP connections are returned to the pool after a GC. While this is an improvement on the current behavior, it is insufficient:
- it only happens during GC, so may not fix the problem entirely
- it doesn't let developers know things are going wrong
- it doesn't let us differentiate well between a stream leak and an overloaded FS
Proposed enhancements:
- collect stack trace in constructor
- log in finalize() at WARN, including path, thread and stack
- have a dedicated log for this, so it can be turned off in production (libraries telling end users off for developer errors is simply an annoyance)
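The proposed enhancements can be sketched roughly as follows. This is a minimal illustration, not the actual Hadoop implementation: the class `LeakReportingInputStream` and the helper `reportLeakIfOpen()` are hypothetical names, and `java.util.logging` is used only to keep the example self-contained.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical wrapper stream; an assumption for illustration only. */
class LeakReportingInputStream extends InputStream {
  // Dedicated logger, so leak reports can be disabled independently of other logs.
  static final Logger LEAK_LOG =
      Logger.getLogger("org.apache.hadoop.fs.resource.leaks");

  private final InputStream inner;
  private final String path;
  private final String threadName;
  // Captured in the constructor: records where the stream was opened.
  private final Throwable createdAt;
  private final AtomicBoolean closed = new AtomicBoolean(false);

  LeakReportingInputStream(InputStream inner, String path) {
    this.inner = inner;
    this.path = path;
    this.threadName = Thread.currentThread().getName();
    this.createdAt = new Throwable("stream opened here");
  }

  @Override
  public int read() throws IOException {
    return inner.read();
  }

  @Override
  public void close() throws IOException {
    // Only the first close() call reaches the wrapped stream.
    if (closed.compareAndSet(false, true)) {
      inner.close();
    }
  }

  /** Reports and releases an unclosed stream; returns true if a leak was found. */
  boolean reportLeakIfOpen() {
    if (!closed.compareAndSet(false, true)) {
      return false; // already closed (or already reported): nothing to do
    }
    LEAK_LOG.log(Level.WARNING,
        "Stream not closed: " + path + " opened in thread " + threadName,
        createdAt);
    try {
      inner.close(); // best-effort release of the underlying resource
    } catch (IOException ignored) {
      // nothing more can be done during finalization
    }
    return true;
  }

  @Override
  @SuppressWarnings({"deprecation", "removal"})
  protected void finalize() throws Throwable {
    try {
      reportLeakIfOpen();
    } finally {
      super.finalize();
    }
  }
}
```

Keeping the reporting logic in a separate method means it can be exercised directly, without waiting for a GC to run the finalizer.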
Leak Reporting
- The log for leak reporting is org.apache.hadoop.fs.resource.leaks
- An error message is reported at WARN, including the file name.
- A stack trace of where the stream was created is reported at INFO.
- A best-effort attempt is made to release any active HTTPS connection.
- The filesystem IOStatistic stream_leaks is incremented.
The intent is to make it easier to identify where streams
are being opened and not closed, as these consume resources,
often including HTTPS connections from the limited-size
connection pool.
It MUST NOT be relied on as a way to clean up open
files/streams automatically; some of the normal actions of
the close() method are omitted.
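Since leak reporting is only a diagnostic, application code still needs to close streams itself. A minimal sketch of the usual pattern, try-with-resources, is shown here; the class `CloseStreamsExample` and the method `countBytes` are hypothetical, and a plain `java.io.InputStream` stands in for a filesystem stream.

```java
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical example class; not part of Hadoop. */
class CloseStreamsExample {
  /** Counts the bytes in the stream and closes it before returning. */
  static long countBytes(InputStream source) throws IOException {
    long count = 0;
    // try-with-resources calls close() even if read() throws.
    try (InputStream in = source) {
      while (in.read() != -1) {
        count++;
      }
    }
    return count;
  }
}
```

With this pattern the full close() logic always runs, so the stream never reaches the finalizer in an open state.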