
Details

    Description

      A recurring problem is that applications forget to close their input streams; eventually the HTTP connection pool runs out of connections.

      Having the finalizer close streams during GC ensures that the HTTP connections are returned to the pool after a GC. While this is an improvement on the current behavior, it is insufficient:

      • it only happens during GC, so it may not fix the problem entirely
      • it doesn't let developers know things are going wrong
      • it doesn't let us differentiate well between a stream leak and an overloaded FS

      Proposed enhancements:

      • collect a stack trace in the constructor
      • log in finalize() at WARN, including path, thread and stack
      • use a dedicated log for this, so it can be turned off in production (libraries telling end users off for developer errors is simply an annoyance)
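      The proposal above can be sketched roughly as follows. This is a minimal illustration, not the Hadoop implementation: the class name LeakTrackingInputStream and the method reportIfLeaked() are hypothetical, and java.util.logging stands in for whatever logging API the filesystem client actually uses. Note that finalize() is deprecated in recent Java releases, so a real implementation may need a different hook.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical wrapper: remembers where it was created and warns if it is finalized while open. */
class LeakTrackingInputStream extends FilterInputStream {

  // Dedicated logger, so leak reports can be switched off independently of other logs.
  static final Logger LEAK_LOG =
      Logger.getLogger("org.apache.hadoop.fs.resource.leaks");

  private final String path;
  private final String threadName;
  private final Throwable creationStack;
  private volatile boolean closed;

  LeakTrackingInputStream(InputStream in, String path) {
    super(in);
    this.path = path;
    this.threadName = Thread.currentThread().getName();
    // A Throwable captures the current stack without being thrown.
    this.creationStack = new Throwable("stream created here");
  }

  @Override
  public void close() throws IOException {
    closed = true;
    super.close();
  }

  /** Logs a warning if the stream is still open; returns true if a leak was reported. */
  boolean reportIfLeaked() {
    if (closed) {
      return false;
    }
    LEAK_LOG.log(Level.WARNING,
        "Stream not closed: " + path + " (opened in thread " + threadName + ")",
        creationStack);
    return true;
  }

  @Override
  @SuppressWarnings({"deprecation", "removal"})
  protected void finalize() throws Throwable {
    try {
      if (reportIfLeaked()) {
        in.close(); // best-effort release of the wrapped stream
      }
    } finally {
      super.finalize();
    }
  }
}
```

      Keeping the reporting logic in its own method means it can be exercised without waiting for a GC, which never guarantees when (or whether) finalize() runs.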

      Leak Reporting

      • The log for leak reporting is org.apache.hadoop.fs.resource.leaks.
      • An error message is reported at WARN, including the file name.
      • A stack trace of where the stream was created is reported
        at INFO.
      • A best-effort attempt is made to release any active HTTPS
        connection.
      • The filesystem IOStatistic stream_leaks is incremented.
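      The reporting sequence listed above could look something like the sketch below. All names here (StreamLeakReporter, STREAM_LEAKS, the Runnable used to release the connection) are assumptions made for illustration; a plain AtomicLong stands in for the stream_leaks IOStatistic, and java.util.logging stands in for the real logging API.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical reporter that performs the leak-reporting steps for one stream. */
class StreamLeakReporter {

  static final Logger LEAK_LOG =
      Logger.getLogger("org.apache.hadoop.fs.resource.leaks");

  // Stand-in for the stream_leaks statistic; not the real IOStatistics API.
  static final AtomicLong STREAM_LEAKS = new AtomicLong();

  private final String path;
  private final Runnable releaseConnection;
  // Captured at construction time, i.e. where the stream was opened.
  private final Throwable createdAt = new Throwable("stream opened here");

  StreamLeakReporter(String path, Runnable releaseConnection) {
    this.path = path;
    this.releaseConnection = releaseConnection;
  }

  /** Called when an unclosed stream is detected. */
  void report() {
    // 1. Warning that names the file.
    LEAK_LOG.warning("Stream leak detected: " + path + " was not closed");
    // 2. Creation stack trace at the lower INFO level.
    LEAK_LOG.log(Level.INFO, "Stream was created at", createdAt);
    // 3. Best-effort release: a failure here must not prevent the rest of the report.
    try {
      releaseConnection.run();
    } catch (RuntimeException e) {
      LEAK_LOG.log(Level.FINE, "Failed to release connection for " + path, e);
    }
    // 4. Count the leak.
    STREAM_LEAKS.incrementAndGet();
  }
}
```

      Logging the stack trace at a lower level than the warning lets operators keep the short message while suppressing the verbose trace.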

      The intent is to make it easier to identify where streams
      are being opened and not closed, as these consume resources,
      often including HTTPS connections from a connection pool
      of limited size.

      It MUST NOT be relied on as a way to clean up open
      files/streams automatically; some of the normal actions of
      the close() method are omitted.
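      Since leak reporting is only a diagnostic, application code still has to close its streams itself. A minimal sketch using try-with-resources follows; the class and method names are hypothetical, and any InputStream can be substituted for the one passed in.

```java
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical helper showing explicit, deterministic stream cleanup. */
class ExplicitClose {

  /** Reads one byte and always closes the stream, even if read() throws. */
  static int readFirstByte(InputStream source) throws IOException {
    try (InputStream in = source) {
      return in.read();
    } // in.close() runs here, so the full close() logic is executed
  }
}
```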

          People

            stevel@apache.org Steve Loughran