[RATIS-845] Memory leak of RaftServerImpl for no unregister from reporter - ASF JIRA

Details

Type: Sub-task
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 1.0.0
Component/s: None
Labels: None

Description

*What's the problem?*
As the image shows, there are 1885 instances of RaftServerImpl. Most of them are closed and should have been garbage collected, but they were not. The image shows that 1513 RaftServerImpl instances are held by ManagementFactory -> JmxMBeanServer -> HashMap, and 372 are held by Datanode ReportManager Thread -> prometheus -> HashMap. So 1513 RaftServerImpl instances leak in Ratis and 372 leak in Ozone. When a RaftServerImpl cannot be collected, many related resources cannot be collected either, such as the DirectByteBuffer in SegmentRaftLogWorker, which results in a 1GB off-heap memory leak.
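The off-heap side of a leak like this does not show up in heap usage, so it can help to read the JVM's direct buffer pool counters. The following is a minimal sketch using only the standard `BufferPoolMXBean` API; the class name `DirectBufferUsage` and the 8 MB allocation are illustrative assumptions and are not taken from Ratis or Ozone.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

// Hypothetical helper, not part of Ratis: reads the JVM's "direct" buffer pool.
public class DirectBufferUsage {

  /** Bytes currently accounted to the "direct" buffer pool, or -1 if the pool is not exposed. */
  public static long directBytesUsed() {
    for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
      if ("direct".equals(pool.getName())) {
        return pool.getMemoryUsed();
      }
    }
    return -1;
  }

  public static void main(String[] args) {
    long before = directBytesUsed();
    // Stand-in for a log worker's write buffer (size chosen arbitrarily).
    ByteBuffer buffer = ByteBuffer.allocateDirect(8 * 1024 * 1024);
    long after = directBytesUsed();
    // The memory stays accounted for as long as 'buffer' is strongly reachable,
    // for example through an owner object that is still registered somewhere.
    System.out.println("direct pool grew by " + (after - before) + " bytes");
    System.out.println("buffer capacity: " + buffer.capacity());
  }
}
```

A direct buffer's native memory is released only after the `ByteBuffer` object itself becomes unreachable and is collected, which is why a retained owner object keeps the off-heap memory alive.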

1. 1885 instances of RaftServerImpl

2. 1513 RaftServerImpl instances were held by ManagementFactory -> JmxMBeanServer -> HashMap, and 372 were held by Datanode ReportManager Thread -> prometheus -> HashMap.

3. 1513 RaftServerImpl instances were held by ManagementFactory -> JmxMBeanServer -> HashMap.

4. 372 RaftServerImpl instances were held by Datanode ReportManager Thread -> prometheus -> HashMap.

5. 2038 DirectByteBuffer instances, 1885 of which were held by RaftServerImpl.

6. 1033 DirectByteBuffer were held by ManagementFactory, 802 DirectByteBuffer were held by Datanode ReportManager Thread, total 1885.

7. RaftServerImpl is held by ManagementFactory -> JmxMBeanServer -> HashMap because Ratis starts a JmxReporter but never stops it.

8. RaftServerImpl is held by Datanode ReportManager Thread -> prometheus -> HashMap because Ozone calls the Ratis function that registers the metrics in Prometheus but never unregisters them.
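The retention mechanism in items 7 and 8 can be illustrated with plain JMX, without the metrics library. This is a minimal sketch under stated assumptions: `MBeanLeakSketch`, `ServerStats`, the `sketch.ratis` domain, and the `register`/`close` helpers are hypothetical names for illustration and are not the Ratis or Ozone code. It shows only that the platform MBeanServer keeps a registered bean until it is explicitly unregistered, so every registration needs a matching unregistration on close.

```java
import java.lang.management.ManagementFactory;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical sketch, not Ratis code: pairs JMX registration with unregistration.
public class MBeanLeakSketch {

  /** Standard MBean interface; JMX requires the name <ImplClass>MBean. */
  public interface ServerStatsMBean {
    int getBufferBytes();
  }

  /** Stand-in for a server object that owns a large buffer. */
  public static class ServerStats implements ServerStatsMBean {
    private final byte[] buffer;

    public ServerStats(int bytes) {
      this.buffer = new byte[bytes];
    }

    @Override
    public int getBufferBytes() {
      return buffer.length;
    }
  }

  /** Registers a bean; from here on the MBeanServer holds a strong reference to it. */
  public static ObjectName register(String groupId) {
    try {
      ObjectName name = new ObjectName("sketch.ratis", "group", groupId);
      MBeanServer server = ManagementFactory.getPlatformMBeanServer();
      server.registerMBean(new ServerStats(1024), name);
      return name;
    } catch (JMException e) {
      throw new IllegalStateException("registration failed for " + groupId, e);
    }
  }

  /** The step whose absence causes the leak: drop the MBeanServer's reference on close. */
  public static void close(ObjectName name) {
    try {
      ManagementFactory.getPlatformMBeanServer().unregisterMBean(name);
    } catch (JMException e) {
      throw new IllegalStateException("unregistration failed for " + name, e);
    }
  }

  public static void main(String[] args) {
    MBeanServer server = ManagementFactory.getPlatformMBeanServer();
    ObjectName name = register("group-1");
    System.out.println("registered after start: " + server.isRegistered(name));
    close(name);
    System.out.println("registered after close: " + server.isRegistered(name));
  }
}
```

The same pairing applies to any registry that stores a reference to the registered object: if the close path skips the unregister call, the closed object stays reachable from the registry's internal map.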

Attachments

screenshot-9.png | 28/Apr/20 05:38 | 69 kB | runzhiwang
screenshot-8.png | 28/Apr/20 05:35 | 9 kB | runzhiwang
screenshot-7.png | 28/Apr/20 03:45 | 113 kB | runzhiwang
screenshot-6.png | 28/Apr/20 03:43 | 108 kB | runzhiwang
screenshot-5.png | 28/Apr/20 03:41 | 15 kB | runzhiwang
screenshot-4.png | 28/Apr/20 03:38 | 23 kB | runzhiwang
screenshot-3.png | 07/Apr/20 03:55 | 40 kB | runzhiwang
screenshot-2.png | 07/Apr/20 03:54 | 171 kB | runzhiwang
screenshot-10.png | 28/Apr/20 05:42 | 15 kB | runzhiwang

Issue Links

blocks

HDDS-3514 Fix Memory leak of RaftServerImpl

Resolved

People

Assignee: runzhiwang

Reporter: runzhiwang

Votes: 0

Watchers: 3

Dates

Created: 07/Apr/20 03:50

Updated: 15/May/20 09:05

Resolved: 15/May/20 09:05

Time Tracking

Estimated: Not Specified

Remaining:

Logged: 4h 40m