Description
In 3.1.0, with stateMachineCache enabled, RaftLogCache entries hold a reference to the original RaftClientRequest. This should not happen: RaftLogCache entries are supposed to refer only to LogEntries with their data truncated, and the RaftLogCache retention policy counts only the size of the entries without data.
This problem impacts Apache Ozone. The reference from the RaftLogCache entries prevents the original RaftClientRequest (which contains a large data chunk) from being garbage collected. As a result, Ozone datanodes quickly run out of heap memory.
This is not the case with the latest master branch, only with the 3.1.0 release.
The fix for this issue in 3.1.0 is as simple as commit 6a141544c567a6325b05e2972cd426cdc14060cb.
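To illustrate why the retention policy does not catch this, here is a minimal, self-contained sketch. It is not Ratis code: `RetentionSketch`, `Request`, `CacheEntry`, and the 64-byte metadata size are all hypothetical stand-ins. It only models the idea described above, that a cache which counts truncated entry sizes can stay far below its limit while each entry still keeps a large request reachable.

```java
import java.util.ArrayList;
import java.util.List;

public class RetentionSketch {
    /** Hypothetical stand-in for a client request that carries a large data chunk. */
    public static final class Request {
        final byte[] data;
        Request(byte[] data) { this.data = data; }
    }

    /** Hypothetical cache entry: truncated metadata plus an optional stray reference. */
    public static final class CacheEntry {
        final long index;
        final int metadataBytes;   // size the retention policy accounts for
        final Request original;    // null when the entry is properly truncated

        CacheEntry(long index, int metadataBytes, Request original) {
            this.index = index;
            this.metadataBytes = metadataBytes;
            this.original = original;
        }
    }

    /** What a size-based retention policy would see: metadata only. */
    public static long countedBytes(List<CacheEntry> cache) {
        long total = 0;
        for (CacheEntry e : cache) {
            total += e.metadataBytes;
        }
        return total;
    }

    /** What the heap actually has to keep alive: metadata plus any referenced payload. */
    public static long reachableBytes(List<CacheEntry> cache) {
        long total = 0;
        for (CacheEntry e : cache) {
            total += e.metadataBytes;
            if (e.original != null) {
                total += e.original.data.length;
            }
        }
        return total;
    }

    /** Builds a cache of `entries` items; keepReference models the buggy behavior. */
    public static List<CacheEntry> fill(int entries, int payloadBytes, boolean keepReference) {
        List<CacheEntry> cache = new ArrayList<>();
        for (int i = 0; i < entries; i++) {
            Request request = new Request(new byte[payloadBytes]);
            // Assumed metadata size of 64 bytes per entry, for illustration only.
            cache.add(new CacheEntry(i, 64, keepReference ? request : null));
        }
        return cache;
    }

    public static void main(String[] args) {
        List<CacheEntry> leaky = fill(4, 1 << 20, true);
        List<CacheEntry> clean = fill(4, 1 << 20, false);
        System.out.println("counted (leaky):   " + countedBytes(leaky));
        System.out.println("reachable (leaky): " + reachableBytes(leaky));
        System.out.println("counted (clean):   " + countedBytes(clean));
        System.out.println("reachable (clean): " + reachableBytes(clean));
    }
}
```

In this sketch both caches report the same counted size, so a size-based eviction rule treats them identically, while the cache that keeps the request references pins roughly 4 MiB of payload.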
Attachments
Issue Links
- fixes: HDDS-11317 Key put failed for large file sizes (Open)
- is caused by: RATIS-1983 Refactor client request processing to support reference count (Resolved)
- is duplicated by: RATIS-2142 OOM for stateMachineCache use cases (Resolved)