Hadoop YARN / YARN-11397

Memory leak when reading aggregated logs from s3 (LogAggregationTFileController::readAggregatedLogs)

Details

    • Type: Bug
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: 3.2.2
    • Fix Version/s: None
    • Component/s: log-aggregation
    • Labels: None
    • Environment: Remote logs dir on S3.

    Description

      Reproduction code in the attachment.

      When aggregated logs are collected from S3 in a loop (see the reproduction code), the number of 'S3AInstrumentation' instances keeps increasing while the number of 'S3AFileSystem' instances does not. This means that 'S3AInstrumentation' is not released together with its 'S3AFileSystem', as it should be. The root cause appears to be a missing close() call on the S3AFileSystem.

      This issue looks similar to https://issues.apache.org/jira/browse/YARN-11039, but here the problem is a memory leak rather than a thread leak, and the affected version is earlier (3.2.2).
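
      The leak pattern described above can be illustrated with a minimal, self-contained Java sketch. It does not use Hadoop. 'FakeFileSystem', 'Instrumentation' and 'REGISTRY' are hypothetical stand-ins for S3AFileSystem, S3AInstrumentation and a process-wide registry. The sketch assumes the instrumentation stays reachable through such a registry until close() is called, which the report suggests but does not confirm.

```java
import java.io.Closeable;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class LeakSketch {
    /** Hypothetical stand-in for a process-wide registry that keeps instrumentation reachable. */
    static final Set<Object> REGISTRY = ConcurrentHashMap.newKeySet();

    /** Hypothetical stand-in for S3AInstrumentation: registers itself on creation. */
    static final class Instrumentation {
        Instrumentation() { REGISTRY.add(this); }
        void unregister() { REGISTRY.remove(this); }
    }

    /** Hypothetical stand-in for S3AFileSystem: only close() releases the instrumentation. */
    static final class FakeFileSystem implements Closeable {
        private final Instrumentation instrumentation = new Instrumentation();

        void readLogs() {
            // Placeholder for reading aggregated logs.
        }

        @Override
        public void close() {
            instrumentation.unregister();
        }
    }

    /** Runs the read loop and returns how many instrumentation objects remain registered. */
    static int leakedAfter(int iterations, boolean closeEachTime) {
        REGISTRY.clear();
        for (int i = 0; i < iterations; i++) {
            if (closeEachTime) {
                // try-with-resources guarantees close() even if readLogs() throws.
                try (FakeFileSystem fs = new FakeFileSystem()) {
                    fs.readLogs();
                }
            } else {
                // No close(): the file system object becomes garbage,
                // but its instrumentation stays in the registry.
                FakeFileSystem fs = new FakeFileSystem();
                fs.readLogs();
            }
        }
        return REGISTRY.size();
    }

    public static void main(String[] args) {
        System.out.println("without close: " + leakedAfter(1000, false));
        System.out.println("with close:    " + leakedAfter(1000, true));
    }
}
```

      In this model the count of registered instrumentation objects grows with every iteration that skips close() and stays at zero when each file system is closed, which matches the symptom reported here. Whether the actual fix belongs in LogAggregationTFileController::readAggregatedLogs or in its caller is not determined by this sketch.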

      Attachments

        1. YarnLogsS3Issue.scala
          2 kB
          Maciej Smolenski

            People

              Assignee: Unassigned
              Reporter: Maciej Smolenski (maciejsmolenski)
              Votes: 0
              Watchers: 4