Hadoop Common / HADOOP-13971

Fix memory leak in FileSystem.Cache.Key class

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: fs, security
    • Labels: None

    Description

      Calling FileSystem#get(final URI uri, final Configuration conf, final String user) repeatedly can leak memory because of how UserGroupInformation implements its hash method. Even when the user name and URI are identical, every call instantiates a new FileSystem object, and each instance stays in the FileSystem cache.
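
      The mechanism can be illustrated with a small, self-contained model. This is only a sketch: Ugi, Key, CACHE, and get below are hypothetical stand-ins, not Hadoop classes, and the sketch assumes that the UGI's equality and hash code depend on object identity rather than on the user name, which is what the description implies.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

final class CacheKeyLeakSketch {

    /** Hypothetical stand-in for UserGroupInformation (assumption: identity-based equality). */
    static final class Ugi {
        final String userName;
        final Object subject = new Object(); // a fresh subject for every instance

        Ugi(String userName) { this.userName = userName; }

        @Override public boolean equals(Object o) {
            return o instanceof Ugi && subject == ((Ugi) o).subject;
        }

        @Override public int hashCode() {
            return System.identityHashCode(subject);
        }
    }

    /** Hypothetical stand-in for FileSystem.Cache.Key: the URI plus the UGI. */
    static final class Key {
        final String uri;
        final Ugi ugi;

        Key(String uri, Ugi ugi) { this.uri = uri; this.ugi = ugi; }

        @Override public boolean equals(Object o) {
            if (!(o instanceof Key)) return false;
            Key other = (Key) o;
            return uri.equals(other.uri) && ugi.equals(other.ugi);
        }

        @Override public int hashCode() { return Objects.hash(uri, ugi); }
    }

    static final Map<Key, Object> CACHE = new HashMap<>();

    /** Models get(uri, conf, user): a new Ugi is built from the user name on every call. */
    static Object get(String uri, String user) {
        Key key = new Key(uri, new Ugi(user));
        // A plain Object stands in for the cached FileSystem instance.
        return CACHE.computeIfAbsent(key, k -> new Object());
    }

    public static void main(String[] args) {
        for (int i = 0; i < 3; i++) {
            get("hdfs://namenode:8020", "alice");
        }
        System.out.println(CACHE.size()); // prints 3: one entry per call, none reused
    }
}
```

      Because no two keys ever compare equal, the cache never gets a hit for this code path and grows by one entry per call.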

      Downstream projects have worked around this bug either by disabling the cache (setting fs.%s.impl.disable.cache to true, where %s is the file system scheme) or by calling FileSystem.closeAllForUGI() to release resources on demand (see, for instance, HIVE-3098, YARN-58, and TEZ-1585).

      However, neither approach is desirable. Disabling the cache costs performance. The bug was discussed extensively in HADOOP-12707, but the workaround proposed there, FileSystem.closeAllForUGI(), is insufficient: because of the same hash method implementation bug, it does not purge the objects from the cache.
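
      A similar model suggests why a purge keyed on the UGI can miss these entries. Again, this is a sketch under assumptions: Ugi, CACHE, and closeAllForUgi are hypothetical stand-ins, the cache is keyed directly by the UGI for brevity, and the purge is assumed to select entries by UGI equality. A caller that only knows the user name has to build a new UGI, which matches nothing that the get-by-user-name path cached.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

final class CloseAllForUgiSketch {

    /** Hypothetical stand-in for UserGroupInformation (assumption: identity-based equality). */
    static final class Ugi {
        final String userName;
        final Object subject = new Object();

        Ugi(String userName) { this.userName = userName; }

        @Override public boolean equals(Object o) {
            return o instanceof Ugi && subject == ((Ugi) o).subject;
        }

        @Override public int hashCode() {
            return System.identityHashCode(subject);
        }
    }

    // Keyed by Ugi only; the String value stands in for a cached FileSystem.
    static final Map<Ugi, String> CACHE = new HashMap<>();

    /** Models a purge that removes every entry whose UGI equals the argument. */
    static int closeAllForUgi(Ugi ugi) {
        int closed = 0;
        Iterator<Ugi> it = CACHE.keySet().iterator();
        while (it.hasNext()) {
            if (it.next().equals(ugi)) {
                it.remove();
                closed++;
            }
        }
        return closed;
    }

    public static void main(String[] args) {
        Ugi internal = new Ugi("alice");   // created inside the get-by-user-name call
        CACHE.put(internal, "fs-1");

        // The caller never saw 'internal', so it purges with a fresh UGI for the same user.
        System.out.println(closeAllForUgi(new Ugi("alice"))); // prints 0
        System.out.println(CACHE.size());                     // prints 1: entry still cached

        // Only the exact instance used as the key matches.
        System.out.println(closeAllForUgi(internal));         // prints 1
    }
}
```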

      I am filing a new JIRA, given that the current workarounds do not work, to invite more discussion. The ideal approach is to change the UGI hash method, but that may break many downstream applications, so I am setting the target version to 3.0.0-beta.

            People

              Assignee: Unassigned
              Reporter: Wei-Chiu Chuang (weichiu)
              Votes: 0
              Watchers: 19
