[FLINK-19125] Avoid memory fragmentation when running flink docker image - ASF JIRA

Details

Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 1.11.1, 1.12.0
Fix Version/s: 1.12.0
Component/s: Deployment / Kubernetes, Runtime / State Backends
Labels:
- pull-request-available

Release Note:

With FLINK-19125, jemalloc is adopted as the default memory allocator in Flink's docker image to reduce issues with memory fragmentation. Users can roll back to using glibc by passing the 'disable-jemalloc' flag to the docker-entrypoint.sh script. For more details, please refer to Flink's documentation.


Description

This ticket tracks the problem of memory fragmentation when launching the default Flink Docker image.

In FLINK-18712, a user reported that when a job using the RocksDB state backend is submitted to a Kubernetes session cluster again and again, each time after the previous run has finished, the memory usage of the TaskManager grows continuously until the process is OOM-killed.
I reproduced this problem with the official Flink Docker image regardless of how RocksDB is configured (with managed memory enabled or disabled).

I dug into the problem and found that it is caused by memory fragmentation in glibc, which does not return memory to the kernel gracefully (please refer to the glibc bugzilla and the glibc manual).

I found that limiting MALLOC_ARENA_MAX to 2 mitigates this problem (please refer to choose-for-malloc_arena_max for more details).
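As an illustration of this mitigation, the variable can be set in the environment of the process that starts Flink. The following is a minimal sketch, not part of the ticket: the wrapper name run_with_arena_cap is hypothetical, and it assumes the process is linked against glibc, which reads MALLOC_ARENA_MAX at startup.

```shell
# Hypothetical helper: run any command with glibc limited to two malloc arenas.
# MALLOC_ARENA_MAX is read by glibc when the child process starts.
run_with_arena_cap() {
  env MALLOC_ARENA_MAX=2 "$@"
}
```

The command that launches the TaskManager would be passed as the arguments, for example `run_with_arena_cap <start command>`. In a Dockerfile, the equivalent would be an ENV line that sets the same variable.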

If we instead use jemalloc as the memory allocator by building another Docker image, the problem disappears:

RUN apt-get update && apt-get -y install libjemalloc-dev

ENV LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so
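To confirm that the preload took effect in a running container, one option is to look for the jemalloc shared object in the memory map of the Flink process. This is a minimal sketch under stated assumptions: the function name has_jemalloc_mapped is hypothetical, and it relies on the Linux /proc/<pid>/maps file listing the libraries mapped into a process.

```shell
# Hypothetical helper: succeeds if the given maps file lists a jemalloc library.
# Pass /proc/<pid>/maps of the process to inspect (Linux only).
has_jemalloc_mapped() {
  grep -q 'libjemalloc' "$1"
}
```

Inside the container, `has_jemalloc_mapped /proc/<pid>/maps` with the PID of the JVM would return success when LD_PRELOAD was honored and failure when the process is still using only the glibc allocator.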

jemalloc emphasizes fragmentation avoidance, so we might consider refactoring our Dockerfile to be based on jemalloc to avoid memory fragmentation.

Issue Links

relates to

FLINK-18712 Flink RocksDB statebackend memory leak issue (Closed)

FLINK-20287 Add documentation of how to switch memory allocator in Flink docker image (Resolved)

links to

GitHub Pull Request #42

GitHub Pull Request #43

People

Assignee: Yun Tang
Reporter: Yun Tang
Votes: 2
Watchers: 19

Dates

Created: 02/Sep/20 13:23
Updated: 04/Jan/21 11:31
Resolved: 20/Nov/20 15:45