[HIVE-3098] Memory leak from large number of FileSystem instances in FileSystem.CACHE - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 0.9.0
Fix Version/s: 0.9.1, 0.10.0
Component/s: Shims
Labels: None
Environment:

Running with Hadoop 20.205.0.3+ / 1.0.x with security turned on.

Description

The problem manifested from stress-testing HCatalog 0.4.1 (as part of testing the Oracle backend).

The HCatalog server ran out of memory (-Xmx2048m) in under 24 hours when pounded by 60 threads. The heap dump indicates that hadoop::FileSystem.CACHE held 1000000 instances of FileSystem, whose combined retained memory consumed the entire heap.

It boiled down to hadoop::UserGroupInformation::equals() being implemented such that the "Subject" member is compared by reference identity ("=="), and not by equivalence (".equals()"). This causes equivalent UGI instances to compare as unequal, so each lookup misses the cache and a new FileSystem instance is created and cached.
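The mechanism above can be reproduced in miniature without Hadoop. The following is only an illustrative sketch: FakeSubject, FakeUgi, CACHE and simulateRequests are hypothetical stand-ins invented here, and the cache key is simplified to the UGI alone (the real FileSystem cache key is assumed to carry more fields than that).

```java
import java.util.HashMap;
import java.util.Map;

class UgiCacheLeakDemo {
    /** Hypothetical stand-in for a Subject; compares by value. */
    static final class FakeSubject {
        final String principal;
        FakeSubject(String principal) { this.principal = principal; }
        @Override public boolean equals(Object o) {
            return o instanceof FakeSubject
                && ((FakeSubject) o).principal.equals(principal);
        }
        @Override public int hashCode() { return principal.hashCode(); }
    }

    /** Hypothetical stand-in for a UGI whose equals() compares the Subject with "==". */
    static final class FakeUgi {
        final FakeSubject subject;
        FakeUgi(FakeSubject subject) { this.subject = subject; }
        @Override public boolean equals(Object o) {
            // Identity comparison: two UGIs for the same user are "different"
            // unless they share the very same Subject object.
            return o instanceof FakeUgi && ((FakeUgi) o).subject == subject;
        }
        @Override public int hashCode() { return System.identityHashCode(subject); }
    }

    /** Simplified stand-in for FileSystem.CACHE, keyed by the UGI only. */
    static final Map<FakeUgi, Object> CACHE = new HashMap<>();

    static Object getFileSystem(FakeUgi ugi) {
        // A cache miss creates (and retains) a new "FileSystem".
        return CACHE.computeIfAbsent(ugi, k -> new Object());
    }

    /** Issues n requests for the same user, each with a freshly built UGI. */
    static int simulateRequests(int n) {
        CACHE.clear();
        for (int i = 0; i < n; i++) {
            getFileSystem(new FakeUgi(new FakeSubject("hcat-user")));
        }
        return CACHE.size();
    }

    public static void main(String[] args) {
        System.out.println(simulateRequests(1000)); // prints 1000
    }
}
```

Every request is for the same user, yet the cache ends up with one entry per request, which is the growth pattern seen in the heap dump.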

The UGI.equals() is so implemented, incidentally, as a fix for yet another problem (HADOOP-6670); so it is unlikely that that implementation can be modified.

The solution for this is to check for UGI equivalence in HCatalog (i.e. in the Hive metastore), using a cache for UGI instances in the shims.
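One way such a UGI cache might look is sketched below. This is not the attached patch: the class UgiCacheSketch, its getOrCreate method, and the choice of the user name as the lookup key are all assumptions made for illustration. The type parameter U stands in for the UGI type so that the sketch compiles without Hadoop on the classpath.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

/**
 * Hypothetical sketch: hands out one canonical instance per user name, so that
 * repeated requests for the same user reuse the same object (and therefore the
 * same Subject), which lets an identity-based equals() succeed.
 */
class UgiCacheSketch<U> {
    private final ConcurrentMap<String, U> byUser = new ConcurrentHashMap<>();
    private final Function<String, U> factory;

    /** factory builds a new instance for a user name on the first request. */
    UgiCacheSketch(Function<String, U> factory) {
        this.factory = factory;
    }

    /** Returns the cached instance for userName, creating it at most once. */
    U getOrCreate(String userName) {
        return byUser.computeIfAbsent(userName, factory);
    }

    /** Number of distinct users currently cached. */
    int size() {
        return byUser.size();
    }

    public static void main(String[] args) {
        UgiCacheSketch<Object> cache = new UgiCacheSketch<>(name -> new Object());
        Object first = cache.getOrCreate("hcat-user");
        Object second = cache.getOrCreate("hcat-user");
        System.out.println(first == second); // prints true
        System.out.println(cache.size());    // prints 1
    }
}
```

With a cache like this in front of it, the number of FileSystem instances would be bounded by the number of distinct users rather than by the number of requests. The sketch never evicts entries, so a real implementation would presumably also need a policy for expiring cached instances.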

I have a patch to fix this. I'll upload it shortly. I just ran an overnight test to confirm that the memory-leak has been arrested.

Attachments


hive-3098.patch, 06/Sep/12 18:48, 5 kB, Mithun Radhakrishnan
Hive-3098_(FS_closeAllForUGI()).patch, 01/Aug/12 18:28, 2 kB, Mithun Radhakrishnan
Hive_3098.patch, 18/Jul/12 21:28, 7 kB, Mithun Radhakrishnan

Issue Links

is related to

HIVE-9234 HiveServer2 leaks FileSystem objects in FileSystem.CACHE (Closed)

relates to

HDFS-3545 DFSClient leak due to malfunctioning of FileSystem Cache (Open)
HIVE-4501 HS2 memory leak - FileSystem objects in FileSystem.CACHE (Resolved)
HIVE-5296 Memory leak: OOM Error after multiple open/closed JDBC connections. (Closed)
HADOOP-17214 Allow file system caching to be disabled for all file systems (Open)
HDFS-3513 HttpFS should cache filesystems (Closed)



People

Assignee: Mithun Radhakrishnan

Reporter: Mithun Radhakrishnan

Votes: 0

Watchers: 18

Dates

Created: 06/Jun/12 18:34

Updated: 03/Sep/20 05:13

Resolved: 07/Sep/12 14:23