Details
- Type: Bug
- Status: Closed
- Priority: Critical
- Resolution: Fixed
- Affects Version/s: 1.1.2
- Labels: None
Description
When a job is completed, closeAllForUGI is called to close all the cached FileSystems in the FileSystem cache. However, the CleanupQueue may run after this occurs and call FileSystem.get() to delete the staging directory, adding a FileSystem to the cache that will never be closed.
People on the user-list have reported this causing their JobTrackers to OOME every two weeks.
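To make the race concrete, the sequence looks roughly like the following. This is a simplified sketch of the branch-1 flow; FileSystem.closeAllForUGI, Path.getFileSystem, and UserGroupInformation.doAs are real APIs, but the surrounding JobTracker/CleanupQueue flow is paraphrased and the variable names (userUGI, stagingDir, conf) are illustrative.

// 1. The job completes and the JobTracker closes every cached FileSystem
//    belonging to the job's UGI.
FileSystem.closeAllForUGI(userUGI);

// 2. Later, the asynchronous CleanupQueue thread deletes the staging directory.
//    Because the cache was just emptied for this UGI, getFileSystem() creates a
//    brand-new FileSystem and puts it back into the cache.
userUGI.doAs(new PrivilegedExceptionAction<Void>() {
  public Void run() throws IOException {
    FileSystem fs = stagingDir.getFileSystem(conf);  // re-populates the cache
    fs.delete(stagingDir, true);
    return null;
  }
});

// 3. Nothing ever closes that new instance, so one cached FileSystem (and its
//    DFSClient) is leaked per completed job until the JobTracker OOMEs.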
Attachments
- MAPREDUCE-5351.patch (3 kB) - Sandy Ryza
- MAPREDUCE-5351-1.patch (6 kB) - Sandy Ryza
- MAPREDUCE-5351-2.patch (6 kB) - Sandy Ryza
- MAPREDUCE-5351-addendum.patch (0.6 kB) - Sandy Ryza
- MAPREDUCE-5351-addendum-1.patch (2 kB) - Sandy Ryza
- JobInProgress_JobHistory.patch (3 kB) - jandyfish
Issue Links
- relates to: MAPREDUCE-5508 JobTracker memory leak caused by unreleased FileSystem objects in JobInProgress#cleanupJob (Resolved)
Activity
Regarding the changes in CleanupQueue.java: wouldn't it be enough to close the FS instance obtained there, instead of doing a closeAll?
On how to test it, could we temporarily (for testing) instrument the FS cache size and monitor whether it grows over time?
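One rough way to do that instrumentation in a test would be a reflection-based probe of the cache size. The field names "CACHE" and "map" are assumptions about the branch-1 FileSystem internals, so this is a test-time sketch only, not a supported API:

// Test-only probe: read the size of the internal FileSystem cache via reflection.
// Assumes a static FileSystem.CACHE field whose Cache object stores its entries
// in a private field named "map" (branch-1 internals).
static int fsCacheSize() throws Exception {
  java.lang.reflect.Field cacheField = FileSystem.class.getDeclaredField("CACHE");
  cacheField.setAccessible(true);
  Object cache = cacheField.get(null);
  java.lang.reflect.Field mapField = cache.getClass().getDeclaredField("map");
  mapField.setAccessible(true);
  return ((java.util.Map<?, ?>) mapField.get(cache)).size();
}

A test could then run a handful of jobs (or drive the cleanup path directly) and assert that this number stays bounded rather than growing with every job.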
sandyr, agree with tucu00... can you please help me understand the motivation for the changes in JIP?
It would be nice to get this into 1.2.1; I believe Matt is trying to get it done over the weekend... any chance we can expedite this?
The motivation for the changes in JIP is that we don't want to close the FileSystem and then immediately reopen it from the CleanupQueue code. Just closing the FS instance (not doing closeAllForUGI) should be fine.
I'll upload a patch today or tomorrow.
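To illustrate the idea (this is not the committed patch, just a sketch of what "close the FS instance obtained there" could look like inside the CleanupQueue deletion path; the names ugi, fullPath, and conf are illustrative):

// Run the deletion as the job's user, then close only the FileSystem instance
// that was opened for it, instead of calling closeAllForUGI.
ugi.doAs(new PrivilegedExceptionAction<Void>() {
  public Void run() throws IOException {
    FileSystem fs = fullPath.getFileSystem(conf);
    try {
      fs.delete(fullPath, true);
    } finally {
      fs.close();  // removes this instance from the FileSystem cache
    }
    return null;
  }
});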
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12590103/MAPREDUCE-5351-1.patch
against trunk revision .
-1 patch. The patch command could not apply the patch.
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3811//console
This message is automatically generated.
sandyr Thanks for the quick turnaround.
One nit - you should, at least, mark the new API in FS as @Private.
Thanks for looking it over. Uploading a patch that marks it as private.
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12590105/MAPREDUCE-5351-2.patch
against trunk revision .
-1 patch. The patch command could not apply the patch.
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3812//console
This message is automatically generated.
I just committed this. Thanks sandyr!
tucu00 - Sorry, I didn't wait longer; I wanted to unblock Matt for the release. Please file a follow-on if you have any concerns. Thanks, and sorry again!
Reopening, as with this fix we are seeing jobs fail with the following exception:
13/07/02 16:06:57 DEBUG mapred.JobClient: Printing tokens for job: job_201307020820_0012
13/07/02 16:06:57 DEBUG ipc.Client: IPC Client (47) connection to host/ip:50300 from hortonar sending #32
13/07/02 16:06:57 DEBUG ipc.Client: IPC Client (47) connection to host/ip:50300 from hortonar got value #32
13/07/02 16:06:57 DEBUG retry.RetryUtils: RETRY 0) policy=TryOnceThenFail, exception=org.apache.hadoop.ipc.RemoteException: java.io.IOException: Filesystem closed
    at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:383)
    at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:1633)
    at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:364)
    at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1166)
    at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:350)
    at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:3599)
    at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:3561)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:587)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1444)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1440)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1438)
13/07/02 16:06:57 INFO mapred.JobClient: Cleaning up the staging area hdfs://host:8020/user/hortonar/.staging/job_201307020820_0012
13/07/02 16:06:57 ERROR security.UserGroupInformation: PriviledgedActionException as:hortonar cause:org.apache.hadoop.ipc.RemoteException: java.io.IOException: Filesystem closed
    at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:383)
    at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:1633)
    at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:364)
    at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1166)
    at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:350)
    at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:3599)
    at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:3561)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:587)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1444)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1440)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1438)
org.apache.hadoop.ipc.RemoteException: java.io.IOException: Filesystem closed
    at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:383)
    at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:1633)
    at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:364)
    at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1166)
    at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:350)
    at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:3599)
    at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:3561)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:587)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1444)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1440)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
It's happening very frequently; quite a few MR jobs have failed because of this exception. Various example jobs, as well as jobs run through Pig, Oozie, etc., have also failed because of it.
Attaching an addendum patch that should fix the issue. I was able to reproduce the issue by running pi jobs repeatedly. With the patch, the exception no longer occurs.
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12590494/MAPREDUCE-5351-addendum.patch
against trunk revision .
-1 patch. The patch command could not apply the patch.
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3822//console
This message is automatically generated.
sandyr - JIP.cleanupJob always results in a PathDeletionContext with ugi being null... how does this fix the original problem? Am I missing something? Tx.
In JIP.cleanupJob:
Path tempDir = jobtracker.getSystemDirectoryForJob(getJobID());
CleanupQueue.getInstance().addToQueue(
    new PathDeletionContext(tempDir, conf));

// delete the staging area for the job and cancel delegation token
String jobTempDir = conf.get("mapreduce.job.dir");
if (jobTempDir != null && conf.getKeepTaskFilesPattern() == null &&
    !conf.getKeepFailedTaskFiles()) {
  Path jobTempDirPath = new Path(jobTempDir);
  tempDirFs = jobTempDirPath.getFileSystem(conf);
  CleanupQueue.getInstance().addToQueue(
      new PathDeletionContext(jobTempDirPath, conf, userUGI, jobId));
}
The CleanupQueue is used twice, once with the UGI set and once without.
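In other words, when the context carries no UGI, the FileSystem obtained in CleanupQueue is the JobTracker's own shared cached instance, and closing it is what produces the "Filesystem closed" errors in submitJob above. A minimal sketch of the kind of guard the addendum is about (illustrative, not the literal patch; names ugi, fullPath, conf as before):

FileSystem fs = fullPath.getFileSystem(conf);
try {
  fs.delete(fullPath, true);
} finally {
  // Only close FileSystems that were opened specifically for the job's user.
  // A null UGI means this is the JobTracker's own shared instance, which must
  // stay open for subsequent job submissions.
  if (ugi != null) {
    fs.close();
  }
}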
Duh, good point.
Can you please add a comment to the addendum patch explaining the rationale for the check? Also, a test case specifically covering this bug would be good. Thanks!
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12590672/MAPREDUCE-5351-addendum-1.patch
against trunk revision .
-1 patch. The patch command could not apply the patch.
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3826//console
This message is automatically generated.
The original patch as well as the addendum look good to me. The new test looks good too. +1 for pushing the addendum.
Both the patch and the addendum patch have been applied by Arun to both branch-1.2 and branch-1. Resolving as Fixed. Thanks, Sandy!
Found a bug:
A FileSystem is opened when JobHistory creates the userLogFile, but this FileSystem instance is never closed and remains cached in the FileSystem cache.
I have attached the svn patch JobInProgress_JobHistory.patch. I hope it can help.
Hi, jandyfish. Thank you for reporting this and posting a patch.
Have you seen the related issue MAPREDUCE-5508? I believe that fixes the bug you saw. The MAPREDUCE-5508 patch has already been reviewed and committed. We tested it at scale (hundreds of nodes/thousands of job executions) and saw no occurrences of the leak.
Hi, Chris Nauroth.
When I was working on this problem I did not find the MAPREDUCE-5508 patch, and I submitted my patch without noticing it.
The MAPREDUCE-5508 patch is perfect and works well. Thank you.
Hi Chris/Jandy,
Do I need to apply the JobInProgress_JobHistory.patch as well to resolve this issue? I haven't seen the OOME issue after applying MAPREDUCE-5508, but the JobTracker hangs after 7 to 8 days.
Please help.
Thanks,
Viswa.J
No, AFAIK the JobInProgress_JobHistory.patch is not required, and this leak has been fixed by the patches already committed to svn for MAPREDUCE-5351 and MAPREDUCE-5508. If you're seeing a hang, but not a memory leak, then this sounds like a different issue. For example, MAPREDUCE-5606 is another known bug that can cause a JT to hang.
Hi Chris,
For the JT hangs, you have asked me to disable the hadoop user history location; will doing this solve the problem? What exactly happens if I set this value to NONE? By setting it, will the JT heap memory stop increasing as much as it does now?
Also, I'm using version 1.2.1; when the JT memory reaches 6.68/8.89 GB it starts to hang, roughly every ten days. We are not able to submit jobs and the UI does not load at all.
Please help.
Hi Chris,
The JT memory reaches 6.68/8.89 GB, we are not able to submit jobs, and the UI does not load at all. But I didn't see any JT OOM exceptions.
I have taken a thread dump of the JobTracker; the JT thread dump is as follows:
Deadlock Detection:
Can't print deadlocks:null
Thread 25817: (state = BLOCKED)
- java.lang.Thread.sleep(long) @bci=0 (Compiled frame; information may be imprecise)
- org.apache.hadoop.hdfs.LeaseRenewer.run(int) @bci=274, line=397 (Compiled frame)
- org.apache.hadoop.hdfs.LeaseRenewer.access$600(org.apache.hadoop.hdfs.LeaseRenewer, int) @bci=2, line=69 (Interpreted frame)
- org.apache.hadoop.hdfs.LeaseRenewer$1.run() @bci=8, line=273 (Interpreted frame)
- java.lang.Thread.run() @bci=11, line=662 (Interpreted frame)
Locked ownable synchronizers:
- None
Thread 25815: (state = BLOCKED)
- java.lang.Object.wait(long) @bci=0 (Compiled frame; information may be imprecise)
- org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run() @bci=245, line=3000 (Compiled frame)
Locked ownable synchronizers:
- None
Thread 25813: (state = BLOCKED)
- java.lang.Object.wait(long) @bci=0 (Compiled frame; information may be imprecise)
- org.apache.hadoop.ipc.Client$Connection.waitForWork() @bci=59, line=747 (Compiled frame)
- org.apache.hadoop.ipc.Client$Connection.run() @bci=55, line=789 (Compiled frame)
Locked ownable synchronizers:
- None
Thread 25812: (state = BLOCKED)
- java.lang.Object.wait(long) @bci=0 (Compiled frame; information may be imprecise)
- org.apache.hadoop.ipc.Client$Connection.waitForWork() @bci=59, line=747 (Compiled frame)
- org.apache.hadoop.ipc.Client$Connection.run() @bci=55, line=789 (Compiled frame)
Locked ownable synchronizers:
- None
Thread 25790: (state = BLOCKED)
- java.lang.Object.wait(long) @bci=0 (Compiled frame; information may be imprecise)
Thread 25788: (state = BLOCKED)
- java.lang.Object.wait(long) @bci=0 (Compiled frame; information may be imprecise)
- org.apache.hadoop.ipc.Client$Connection.waitForWork() @bci=59, line=747 (Compiled frame)
- org.apache.hadoop.ipc.Client$Connection.run() @bci=55, line=789 (Compiled frame)
Locked ownable synchronizers:
- None
Thread 25786: (state = BLOCKED)
- java.lang.Object.wait(long) @bci=0 (Compiled frame; information may be imprecise)
- org.apache.hadoop.ipc.Client$Connection.waitForWork() @bci=59, line=747 (Compiled frame)
- org.apache.hadoop.ipc.Client$Connection.run() @bci=55, line=789 (Compiled frame)
Locked ownable synchronizers:
- None
Thread 25761: (state = BLOCKED)
- sun.nio.ch.EPollArrayWrapper.epollWait(long, int, long, int) @bci=0 (Compiled frame; information may be imprecise)
- sun.nio.ch.EPollArrayWrapper.poll(long) @bci=18, line=210 (Compiled frame)
- sun.nio.ch.EPollSelectorImpl.doSelect(long) @bci=28, line=65 (Compiled frame)
- sun.nio.ch.SelectorImpl.lockAndDoSelect(long) @bci=37, line=69 (Compiled frame)
- sun.nio.ch.SelectorImpl.select(long) @bci=30, line=80 (Interpreted frame)
- org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(java.nio.channels.SelectableChannel, int, long) @bci=46, line=332 (Interpreted frame)
- org.apache.hadoop.net.SocketIOWithTimeout.doIO(java.nio.ByteBuffer, int) @bci=80, line=157 (Compiled frame)
- org.apache.hadoop.net.SocketInputStream.read(java.nio.ByteBuffer) @bci=6, line=155 (Compiled frame)
- org.apache.hadoop.net.SocketInputStream.read(byte[], int, int) @bci=7, line=128 (Compiled frame)
- java.io.FilterInputStream.read(byte[], int, int) @bci=7, line=116 (Interpreted frame)
- org.apache.hadoop.ipc.Client$Connection$PingInputStream.read(byte[], int, int) @bci=4, line=364 (Interpreted frame)
- java.io.BufferedInputStream.fill() @bci=175, line=218 (Compiled frame)
- java.io.BufferedInputStream.read() @bci=12, line=237 (Compiled frame)
- java.io.DataInputStream.readInt() @bci=4, line=370 (Compiled frame)
- org.apache.hadoop.ipc.Client$Connection.receiveResponse() @bci=19, line=845 (Compiled frame)
- org.apache.hadoop.ipc.Client$Connection.run() @bci=62, line=790 (Compiled frame)
-------------------------------------------------------------------------------------------------------------------------------
And Jobtracker heap summary as follows:
using thread-local object allocation.
Parallel GC with 10 thread(s)
Heap Configuration:
MinHeapFreeRatio = 40
MaxHeapFreeRatio = 70
MaxHeapSize = 10737418240 (10240.0MB)
NewSize = 1310720 (1.25MB)
MaxNewSize = 17592186044415 MB
OldSize = 5439488 (5.1875MB)
NewRatio = 2
SurvivorRatio = 8
PermSize = 21757952 (20.75MB)
MaxPermSize = 85983232 (82.0MB)
Heap Usage:
PS Young Generation
Eden Space:
capacity = 6488064 (6.1875MB)
used = 6488064 (6.1875MB)
free = 0 (0.0MB)
100.0% used
From Space:
capacity = 9764864 (9.3125MB)
used = 0 (0.0MB)
free = 9764864 (9.3125MB)
0.0% used
To Space:
capacity = 9764864 (9.3125MB)
used = 0 (0.0MB)
free = 9764864 (9.3125MB)
0.0% used
PS Old Generation
capacity = 7158300672 (6826.6875MB)
used = 7158240200 (6826.629829406738MB)
free = 60472 (0.05767059326171875MB)
99.99915521849708% used
PS Perm Generation
capacity = 26738688 (25.5MB)
used = 26428648 (25.204322814941406MB)
free = 310040 (0.29567718505859375MB)
98.8404816272212% used
Please help. It affects our production system.
Thanks,
Viswa
Uploaded a patch that should fix the issue, but haven't yet tried to verify it. I'm not sure how to write a test for this.