RATIS-2141: OOM for stateMachineCache use cases

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Resolved
    • Affects Version/s: 3.1.0
    • Fix Version/s: 3.1.1
    • Component/s: server
    • Labels: None

    Description

      In 3.1.0, with stateMachineCache enabled, RaftLogCache entries hold a reference to the original RaftClientRequest. This should not happen: RaftLogCache entries are supposed to refer only to LogEntries whose data has been truncated, and the RaftLogCache retention policy counts only the size of the entries without data.

      This problem impacts Apache Ozone. The reference from the RaftLogCache entries prevents the original RaftClientRequest (which contains a large data chunk) from being garbage-collected. As a result, Ozone datanodes quickly run out of heap memory.

      The latest master branch is not affected; the problem occurs only in the 3.1.0 release.

      The fix for this issue in 3.1.0 is as simple as commit 6a141544c567a6325b05e2972cd426cdc14060cb.
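
      The retention mismatch described above can be illustrated with a minimal sketch. This is not Ratis code: CacheRetentionSketch, Request, CachedEntry and all method names are hypothetical stand-ins, and the 64-byte metadata size is an arbitrary assumption. It only shows how a cache that accounts for truncated entries can still keep far more heap reachable when each entry retains a reference to its originating request.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; none of these types are Ratis classes. */
public class CacheRetentionSketch {
    /** Stand-in for a client request that carries a large data chunk. */
    static final class Request {
        final byte[] data;

        Request(int size) {
            this.data = new byte[size];
        }
    }

    /** Stand-in for a cached log entry whose data has been truncated. */
    static final class CachedEntry {
        final long index;
        final int metadataSize; // the only size the retention policy counts
        final Request origin;   // null when the entry is detached from its request

        CachedEntry(long index, int metadataSize, Request origin) {
            this.index = index;
            this.metadataSize = metadataSize;
            this.origin = origin;
        }
    }

    /** Bytes the retention policy believes the cache holds. */
    static long accountedBytes(List<CachedEntry> cache) {
        long total = 0;
        for (CachedEntry e : cache) {
            total += e.metadataSize;
        }
        return total;
    }

    /** Bytes actually kept reachable (not collectable) through the cache. */
    static long reachableBytes(List<CachedEntry> cache) {
        long total = 0;
        for (CachedEntry e : cache) {
            total += e.metadataSize;
            if (e.origin != null) {
                total += e.origin.data.length;
            }
        }
        return total;
    }

    /** Builds a cache; keepRequestReference = true mimics the problem described above. */
    static List<CachedEntry> fill(int entries, int chunkSize, boolean keepRequestReference) {
        List<CachedEntry> cache = new ArrayList<>();
        for (int i = 0; i < entries; i++) {
            Request request = new Request(chunkSize);
            cache.add(new CachedEntry(i, 64, keepRequestReference ? request : null));
        }
        return cache;
    }

    public static void main(String[] args) {
        List<CachedEntry> leaky = fill(16, 64 * 1024, true);
        List<CachedEntry> detached = fill(16, 64 * 1024, false);
        System.out.println("accounted (leaky):    " + accountedBytes(leaky));
        System.out.println("reachable (leaky):    " + reachableBytes(leaky));
        System.out.println("reachable (detached): " + reachableBytes(detached));
    }
}
```

      In the sketch, both caches report the same accounted size, so a size-based retention policy would evict neither sooner than the other, yet the cache whose entries retain their requests pins every data chunk in memory.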

       

      Attachments

        1. RaftLogCache_entry.png
          124 kB
          Duong
        2. heap-dump.png
          140 kB
          Duong

            People

              Assignee: Unassigned
              Reporter: Duong (duongnguyen)