Details
Bug
Status: Closed
Major
Resolution: Fixed
2.7.0, 2.8.0
None
None
Description
Page::read uses a read-only mmap to read paged messages:
if the mapped regions it accesses are not in the OS page cache, the reads can trigger major page faults, which lead to very long time-to-safepoint pauses (these can be observed by enabling -XX:+PrintGCApplicationStoppedTime).
Such pauses can significantly delay GC work, much like long stop-the-world pauses, blocking the broker long enough that connected clients consider it dead or that the broker shuts itself down.
The mmap read was originally introduced so that Page::read would not have to allocate big direct ByteBuffers just to read the paged messages in full from the filesystem. Implementing chunked reading of those files while reusing the read ByteBuffer would keep the number of syscalls needed to read the file low and avoid the long time-to-safepoint pauses too.
Any OS pause inside a JNI call (i.e. NIO FileChannel::read) won't cause any safepoint delay (i.e. a thread inside a JNI call is already in a safepoint, not between safepoints).
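
The chunked reading idea described above could look roughly like the sketch below. It is not the actual fix applied to Page::read: ChunkedPageReader, ChunkConsumer and readChunked are hypothetical names used only for illustration, and the 4096-byte buffer size is an arbitrary assumption. It only shows the general pattern of reading a file through FileChannel::read into a single reused ByteBuffer instead of mapping it.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not part of the broker's code base.
final class ChunkedPageReader {

    /** Hypothetical callback receiving each chunk; it must not retain the buffer. */
    interface ChunkConsumer {
        void accept(ByteBuffer chunk) throws IOException;
    }

    /**
     * Reads the whole file through one reusable buffer and returns the
     * total number of bytes read. Blocking I/O happens inside
     * FileChannel::read rather than on a page fault of a mapped region.
     */
    static long readChunked(Path file, ByteBuffer reusable, ChunkConsumer consumer)
            throws IOException {
        if (reusable.capacity() == 0) {
            throw new IllegalArgumentException("buffer capacity must be > 0");
        }
        long total = 0;
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            while (true) {
                reusable.clear();                  // reuse the same buffer for every chunk
                int read = channel.read(reusable); // may fill less than the full buffer
                if (read < 0) {
                    break;                         // end of file
                }
                reusable.flip();                   // expose only the bytes just read
                total += read;
                consumer.accept(reusable);
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("page", ".bin");
        try {
            Files.write(file, new byte[10_000]);
            // Assumed chunk size; a real implementation would tune this.
            ByteBuffer buffer = ByteBuffer.allocateDirect(4096);
            int[] chunks = {0};
            long total = readChunked(file, buffer, chunk -> chunks[0]++);
            System.out.println(total + " bytes read in " + chunks[0] + " chunks");
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

Because the buffer is cleared and refilled on every iteration, the consumer has to decode or copy what it needs before returning. The trade-off is one syscall per chunk, so a larger buffer means fewer syscalls at the cost of more direct memory held.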