Details
- Sub-task
- Status: Resolved
- Major
- Resolution: Fixed
Description
*What's the problem?*
As the image shows, there are 1885 instances of RaftServerImpl. Most of them are closed and should have been garbage collected, but they were not. The image shows that
1513 RaftServerImpl instances are held by ManagementFactory -> JmxMBeanServer -> HashMap, and 372 are held by Datanode ReportManager Thread -> prometheus -> HashMap. So 1513 RaftServerImpl instances leak in Ratis and 372 leak in Ozone. While a RaftServerImpl cannot be collected, the resources it references cannot be collected either, such as the DirectByteBuffer in SegmentedRaftLogWorker, which results in a 1 GB off-heap memory leak.
1. 1885 instances of RaftServerImpl.
2. 1513 RaftServerImpl instances are held by ManagementFactory -> JmxMBeanServer -> HashMap; 372 are held by Datanode ReportManager Thread -> prometheus -> HashMap.
3. 1513 RaftServerImpl instances are held by ManagementFactory -> JmxMBeanServer -> HashMap.
4. 372 RaftServerImpl instances are held by Datanode ReportManager Thread -> prometheus -> HashMap.
5. 2038 DirectByteBuffer instances, 1885 of them held by RaftServerImpl.
6. 1033 DirectByteBuffer instances are held by ManagementFactory and 802 by Datanode ReportManager Thread, total 1885.
7. RaftServerImpl is held by ManagementFactory -> JmxMBeanServer -> HashMap because Ratis starts a JmxReporter but never stops it.
8. RaftServerImpl is held by Datanode ReportManager Thread -> prometheus -> HashMap because Ozone calls the Ratis function that registers metrics in Prometheus but never unregisters them.
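The retention pattern in items 7 and 8 can be illustrated with a minimal sketch that uses only the JDK. It is not the Ratis or Ozone fix: `MetricsLifecycleSketch`, `ServerStats`, and the `sketch.metrics` object name are hypothetical stand-ins. The sketch only shows that an object registered with the platform MBean server stays strongly reachable from it until it is explicitly unregistered, so every registration on start needs a matching unregistration on close.

```java
import java.lang.management.ManagementFactory;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical sketch: pairs JMX registration with unregistration on close.
final class MetricsLifecycleSketch {

    /** Standard MBean interface; JMX expects the "<ClassName>MBean" naming. */
    public interface ServerStatsMBean {
        boolean isClosed();
    }

    /** Hypothetical stand-in for a server object whose metrics are exported via JMX. */
    public static final class ServerStats implements ServerStatsMBean, AutoCloseable {
        private final MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
        private final ObjectName name;
        private volatile boolean closed;

        public ServerStats(String groupId) {
            try {
                // Assumed naming scheme, chosen only for this example.
                name = new ObjectName("sketch.metrics:type=ServerStats,group=" + groupId);
            } catch (JMException e) {
                throw new IllegalArgumentException(e);
            }
        }

        /** Registers this object; from now on the MBean server references it. */
        public void start() {
            try {
                mbs.registerMBean(this, name);
            } catch (JMException e) {
                throw new IllegalStateException(e);
            }
        }

        @Override
        public boolean isClosed() {
            return closed;
        }

        public boolean isRegistered() {
            return mbs.isRegistered(name);
        }

        /** Unregisters the MBean so the server no longer pins this object in memory. */
        @Override
        public void close() {
            closed = true;
            try {
                if (mbs.isRegistered(name)) {
                    mbs.unregisterMBean(name);
                }
            } catch (JMException e) {
                throw new IllegalStateException(e);
            }
        }
    }

    public static void main(String[] args) {
        ServerStats stats = new ServerStats("group-1");
        stats.start();
        System.out.println("registered after start: " + stats.isRegistered());
        stats.close();
        System.out.println("registered after close: " + stats.isRegistered());
    }
}
```

If `close()` skipped the `unregisterMBean` call, the closed `ServerStats` would remain reachable through the MBean server, which is the same shape as the leak described above. The same start/stop pairing applies to any metrics registry that keeps its collectors in a map.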
Attachments
Issue Links
- blocks: HDDS-3514 Fix Memory leak of RaftServerImpl (Resolved)