Details
Bug
Status: Resolved
Major
Resolution: Fixed
2.7.5
None
None
Reviewed
Description
When running some unit tests for Storm, we found that we would occasionally get out-of-memory errors in the HDFS integration tests.
When I took a heap dump, I found that the ShutdownHookManager was full of BlockPoolSlice$1 instances, each of which holds a reference to its BlockPoolSlice, which in turn holds a reference to the DataNode, and so on.
It looks like when shutdown is called on the BlockPoolSlice, there is no way to remove the shutdown hook, because no reference to it is saved.
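The retention pattern described above can be reproduced without Hadoop. The following is a minimal sketch, not the actual HDFS code: `HookRegistry` and `Slice` are hypothetical stand-ins for ShutdownHookManager and BlockPoolSlice, and `hooksLeftAfter` is a helper invented for the illustration. It assumes the hook is an anonymous inner class (which is what a `$1` class name indicates), so every registered hook keeps its enclosing object reachable until the hook is removed.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class ShutdownHookLeakSketch {

    /** Hypothetical stand-in for a process-wide registry such as ShutdownHookManager. */
    static final class HookRegistry {
        private final Set<Runnable> hooks = new LinkedHashSet<>();

        synchronized void add(Runnable hook) { hooks.add(hook); }
        synchronized boolean remove(Runnable hook) { return hooks.remove(hook); }
        synchronized int size() { return hooks.size(); }
    }

    /** Hypothetical stand-in for BlockPoolSlice. */
    static final class Slice {
        private final HookRegistry registry;
        // Saving the hook in a field is what makes deregistration possible.
        private final Runnable hook;
        // Stands in for the state (DataNode etc.) that the slice keeps reachable.
        private final byte[] state = new byte[1024];

        Slice(HookRegistry registry) {
            this.registry = registry;
            // Anonymous inner class: compiled as Slice$1, implicitly references Slice.this.
            this.hook = new Runnable() {
                @Override
                public void run() { saveState(); }
            };
            registry.add(hook);
        }

        void saveState() { /* persist state here */ }

        /** With deregister == false this mimics the reported behaviour. */
        void shutdown(boolean deregister) {
            saveState();
            if (deregister) {
                registry.remove(hook);
            }
        }
    }

    /** Creates and shuts down `cycles` slices, then reports how many hooks remain registered. */
    static int hooksLeftAfter(int cycles, boolean deregister) {
        HookRegistry registry = new HookRegistry();
        for (int i = 0; i < cycles; i++) {
            new Slice(registry).shutdown(deregister);
        }
        return registry.size();
    }

    public static void main(String[] args) {
        System.out.println("hook not removed: " + hooksLeftAfter(1000, false) + " hooks retained");
        System.out.println("hook removed:     " + hooksLeftAfter(1000, true) + " hooks retained");
    }
}
```

In the first case every slice that was ever created stays reachable through the registry, which matches the heap-dump observation in this report. Keeping the hook in a field and removing it during shutdown is one possible fix; the patch attached to this issue may take a different approach.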
Hi revans2, thanks for reporting this issue. I tried to recreate it by setting up MiniDFSCluster in a loop. It eventually runs out of heap memory, but I don't see that happening because of BlockPoolSlice. (I took 15+ heap dumps on OOM and didn't find a single instance of BlockPoolSlice in any of them.) However, there is a genuine OOM problem when MiniDFSCluster is built and shut down repeatedly in a loop. In MiniDFSCluster#shutdown we call ShutdownHookManager#clearShutdownHooks, which removes all the shutdown hooks before the Runtime can call them. I think this is not correct, as it defeats the purpose of a shutdown hook. I will attach an initial patch for review on this. On the bigger issue of OOM in MiniDFSCluster, heap dumps show that 80-90% of memory is retained by entries in BlockMap, which is referenced from multiple classes.