Description
One of my tservers has 128G of memory in total and runs with these gflags:
-memory_limit_hard_bytes=107374182475 (100G) -memory_limit_soft_percentage=85 -memory_pressure_percentage=80
About 95% of the memory is in use. The "top" output looks like this:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
8359 work 20 0 0.326t 0.116t 81780 S 727.9 94.6 230228:10 kudu_tablet_ser
That is, the kudu_tablet_server process is using about 116G of memory.
On the mem-trackers page, I found that the "Total consumption" value is about 65G, much lower than 116G.
Then I logged in to the server and read the code to check whether the maintenance manager (MM) operations that free memory were working correctly. Unfortunately, the memory pressure detection function (process_memory::UnderMemoryPressure) does not report that the process is under pressure, because the tcmalloc function GetNumericProperty(const char* property, size_t* value), called with the property "generic.current_allocated_bytes", does not return the memory usage that the OS reports.
The tcmalloc documentation (https://gperftools.github.io/gperftools/tcmalloc.html) describes this property as follows:
generic.current_allocated_bytes: Number of bytes used by the application. This will not typically match the memory use reported by the OS, because it does not include TCMalloc overhead or memory fragmentation.
In this situation, the operations that are meant to free memory may not be scheduled promptly, so the OS may run out of memory and kill the tserver (OOM).
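To make the gap concrete, the numbers from this report can be plugged into a simplified model of the pressure check. This is only a sketch, not Kudu's actual code: the helper names are hypothetical, and it assumes that pressure is reported once consumption reaches memory_pressure_percentage of memory_limit_hard_bytes, and that the value seen by the check is close to the ~65G "Total consumption" figure.

```cpp
#include <cstdint>

// Simplified model of a memory-pressure check. NOT Kudu's implementation;
// all names below are hypothetical and exist only for illustration.

constexpr int64_t kGiB = 1024LL * 1024 * 1024;

// Values taken from the gflags in this report.
constexpr int64_t kHardLimitBytes = 107374182475LL;  // -memory_limit_hard_bytes
constexpr int kPressurePct = 80;                     // -memory_pressure_percentage

// Assumption: the threshold is a plain percentage of the hard limit.
constexpr int64_t PressureThresholdBytes(int64_t hard_limit, int pct) {
  return hard_limit * pct / 100;
}

// Assumption: pressure is reported when consumption reaches the threshold.
constexpr bool UnderPressure(int64_t consumption, int64_t hard_limit, int pct) {
  return consumption >= PressureThresholdBytes(hard_limit, pct);
}

// Approximate figures from the report: what the allocator-level accounting
// shows (~65G) versus what the OS reports as resident (~116G).
constexpr int64_t kTrackedBytes = 65 * kGiB;
constexpr int64_t kResidentBytes = 116 * kGiB;
```

Under these assumptions the threshold is 85899345980 bytes (about 80G). UnderPressure(kTrackedBytes, kHardLimitBytes, kPressurePct) is false, while the same check fed with kResidentBytes would be true, which matches the behavior described above: the process is far beyond the threshold from the OS point of view, but the check never sees it.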