  1. Ratis
  2. RATIS-926 Memory leak
  3. RATIS-845

Memory leak of RaftServerImpl because it is never unregistered from the metrics reporters


Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.0.0
    • Component/s: None
    • Labels: None

    Description

      *What's the problem?*
      As the image shows, there are 1885 instances of RaftServerImpl. Most of them are closed and should have been garbage collected, but they were not.
      The image shows that 1513 RaftServerImpl instances are held by ManagementFactory -> JmxMBeanServer -> HashMap, and 372 are held by Datanode ReportManager Thread -> prometheus -> HashMap. So 1513 RaftServerImpl instances leak in Ratis and 372 leak in Ozone. When a RaftServerImpl cannot be collected, many related resources cannot be collected either, such as the DirectByteBuffer in SegmentRaftLogWorker, which results in a 1 GB off-heap memory leak.
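
The retention chain described above can be sketched with the JDK alone. This is a minimal illustration, not Ratis code: `Owner`, `REGISTRY`, and `directMemoryUsed` are hypothetical names, and a plain static map stands in for the reporter's internal HashMap. It shows that while a long-lived map still references an owner object, the direct buffer that owner holds stays allocated off-heap.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

final class DirectBufferRetentionSketch {

  /** Hypothetical stand-in for a reporter's long-lived registry map. */
  static final Map<String, Owner> REGISTRY = new HashMap<>();

  /** Hypothetical stand-in for a server object that owns off-heap memory. */
  static final class Owner {
    final ByteBuffer buffer = ByteBuffer.allocateDirect(1 << 20); // 1 MiB off-heap
  }

  /** Bytes currently used by the JVM's "direct" buffer pool, or -1 if not found. */
  static long directMemoryUsed() {
    for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
      if ("direct".equals(pool.getName())) {
        return pool.getMemoryUsed();
      }
    }
    return -1;
  }

  public static void main(String[] args) {
    // The map entry is the only reference, yet it is enough to pin the buffer.
    REGISTRY.put("server-1", new Owner());
    System.out.println("direct bytes while registered: " + directMemoryUsed());

    // Only after the entry is removed does the owner (and its buffer) become collectable.
    REGISTRY.remove("server-1");
  }
}
```

Removing the map entry does not free the memory immediately; it only makes the owner eligible for collection, which is the step the leaked instances never reach.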

      1. 1885 instances of RaftServerImpl.

      2. 1513 RaftServerImpl instances are held by ManagementFactory -> JmxMBeanServer -> HashMap, and 372 are held by Datanode ReportManager Thread -> prometheus -> HashMap.

      3. 1513 RaftServerImpl instances are held by ManagementFactory -> JmxMBeanServer -> HashMap.

      4. 372 RaftServerImpl instances are held by Datanode ReportManager Thread -> prometheus -> HashMap.

      5. 2038 DirectByteBuffer instances, 1885 of which are held by RaftServerImpl.

      6. 1033 DirectByteBuffer instances are held by ManagementFactory and 802 are held by Datanode ReportManager Thread, total 1885.

      7. RaftServerImpl is held by ManagementFactory -> JmxMBeanServer -> HashMap because Ratis starts a JmxReporter but never stops it.

      8. RaftServerImpl is held by Datanode ReportManager Thread -> prometheus -> HashMap because Ozone calls the Ratis function to register metrics in Prometheus but never unregisters them.
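
Both root causes come down to a registration without a matching unregistration. The sketch below shows the paired lifecycle using only the JDK's platform MBeanServer. It is not the Ratis or Ozone fix: `FakeServer`, `FakeServerMBean`, and the `sketch.raft` object-name domain are hypothetical, and the real code goes through a metrics reporter rather than registering an MBean directly.

```java
import java.lang.management.ManagementFactory;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

final class ReporterLifecycleSketch {

  /** Hypothetical management interface; standard MBeans need the "<Impl>MBean" name. */
  public interface FakeServerMBean {
    String getId();
  }

  /** Hypothetical stand-in for a server object that exposes itself over JMX. */
  public static final class FakeServer implements FakeServerMBean, AutoCloseable {
    private final String id;
    private final ObjectName name;
    private final MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();

    public FakeServer(String id) throws JMException {
      this.id = id;
      this.name = new ObjectName("sketch.raft:type=FakeServer,id=" + id);
      // From here on, the platform MBeanServer holds a strong reference to this object.
      mbs.registerMBean(this, name);
    }

    @Override
    public String getId() {
      return id;
    }

    public boolean isRegistered() {
      return mbs.isRegistered(name);
    }

    /** Undo the registration so the MBeanServer no longer pins this object. */
    @Override
    public void close() throws JMException {
      if (mbs.isRegistered(name)) {
        mbs.unregisterMBean(name);
      }
    }
  }

  public static void main(String[] args) throws JMException {
    FakeServer server = new FakeServer("s1");
    System.out.println("registered after start: " + server.isRegistered());
    server.close();
    System.out.println("registered after close: " + server.isRegistered());
  }
}
```

Tying the unregistration to `close()` means a closed server leaves no entry behind in the MBeanServer's internal map. The same pairing would apply to any metrics registry that a server registers itself with.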

      Attachments

        1. screenshot-9.png
          69 kB
          runzhiwang
        2. screenshot-8.png
          9 kB
          runzhiwang
        3. screenshot-7.png
          113 kB
          runzhiwang
        4. screenshot-6.png
          108 kB
          runzhiwang
        5. screenshot-5.png
          15 kB
          runzhiwang
        6. screenshot-4.png
          23 kB
          runzhiwang
        7. screenshot-3.png
          40 kB
          runzhiwang
        8. screenshot-2.png
          171 kB
          runzhiwang
        9. screenshot-10.png
          15 kB
          runzhiwang

            People

              Assignee: yjxxtd runzhiwang
              Reporter: yjxxtd runzhiwang
              Votes: 0
              Watchers: 3

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 4h 40m