FLINK-7289

Memory allocation of RocksDB can be problematic in container environments

Details

      With FLINK-7289, the memory usage of the RocksDB state backend can be controlled. By default, users set the RocksDB memory bound through `taskmanager.memory.managed.size` or `taskmanager.memory.managed.fraction`, tune the write/read memory ratio through `state.backend.rocksdb.memory.write-buffer-ratio` (default 0.5), and set the memory fraction reserved for index and filter blocks through `state.backend.rocksdb.memory.high-prio-pool-ratio` (default 0.1). A `state.backend.rocksdb.memory.fixed-per-slot` option is also available for manual control, but it is intended for experts only. For more details, refer to the Flink documentation.
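
To make the two ratios concrete, here is a minimal arithmetic sketch. It assumes, for illustration only, that both ratios are applied to the same per-slot managed memory total; Flink's internal formula may differ, and the names `RocksDBBudget` and `split_managed_memory` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RocksDBBudget:
    write_buffer_bytes: int      # share available to memtables (write path)
    high_prio_bytes: int         # share reserved for index/filter blocks
    remaining_cache_bytes: int   # what is left for ordinary data blocks


def split_managed_memory(managed_bytes_per_slot: int,
                         write_buffer_ratio: float = 0.5,
                         high_prio_pool_ratio: float = 0.1) -> RocksDBBudget:
    """Simplified split of a per-slot budget (assumption: both ratios
    are fractions of the same total, which may not match Flink exactly)."""
    for ratio in (write_buffer_ratio, high_prio_pool_ratio):
        if not 0.0 <= ratio <= 1.0:
            raise ValueError("ratios must lie in [0, 1]")
    if write_buffer_ratio + high_prio_pool_ratio > 1.0:
        raise ValueError("ratios must not sum to more than 1")

    write_buffer = int(managed_bytes_per_slot * write_buffer_ratio)
    high_prio = int(managed_bytes_per_slot * high_prio_pool_ratio)
    remaining = managed_bytes_per_slot - write_buffer - high_prio
    return RocksDBBudget(write_buffer, high_prio, remaining)
```

With the default ratios quoted above, a budget of 1000 units would split into 500 for write buffers, 100 for index/filter blocks, and 400 for everything else under this simplified model.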

    Description

      Flink's RocksDB-based state backend allocates native memory. The amount of memory allocated by RocksDB is not under the control of Flink or the JVM and can (theoretically) grow without limit.
      In container environments, this is problematic because the process can exceed the container's memory budget and get killed. Currently, there is no option other than trusting RocksDB to be well behaved and to follow its memory configuration. However, limiting RocksDB's memory usage is not as easy as setting a single limit parameter. The effective limit is determined by the interplay of several configuration parameters, which is almost impossible for users to get right. Even worse, multiple RocksDB instances can run inside the same process, so reasoning about the configuration also depends on the Flink job.
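
The interplay described above can be illustrated with a rough upper-bound estimate. This is a sketch under stated assumptions, not RocksDB's actual accounting: it assumes every column family has its own memtables and its own block cache, and it ignores index/filter blocks held outside the cache and blocks pinned by iterators. `write_buffer_size` and `max_write_buffer_number` are RocksDB option names; the function and its other parameters are hypothetical.

```python
def worst_case_native_bytes(instances_per_process: int,
                            column_families_per_instance: int,
                            write_buffer_size: int,
                            max_write_buffer_number: int,
                            block_cache_size: int) -> int:
    """Rough upper bound on RocksDB native memory for one process.

    Assumes each column family can fill all of its memtables and its
    own block cache at the same time (a deliberately pessimistic model).
    """
    memtables = write_buffer_size * max_write_buffer_number
    per_column_family = memtables + block_cache_size
    return (instances_per_process
            * column_families_per_instance
            * per_column_family)


MiB = 1024 * 1024

# Example numbers only: 4 RocksDB instances, 5 column families each,
# 64 MiB write buffers (2 per column family), 8 MiB block cache each.
estimate = worst_case_native_bytes(4, 5, 64 * MiB, 2, 8 * MiB)
print(estimate // MiB, "MiB")
```

The point of the sketch is that the total scales with the number of instances and column families, both of which depend on the job, so no single per-instance setting bounds the process.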

      Some information about the memory management in RocksDB can be found here:
      https://github.com/facebook/rocksdb/wiki/Memory-usage-in-RocksDB
      https://github.com/facebook/rocksdb/wiki/RocksDB-Tuning-Guide

      We should help users in one or more of the following ways:

      • Some way to autotune or calculate the RocksDB configuration.
      • Conservative default values.
      • Additional documentation.
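
As one possible reading of "calculate the RocksDB configuration", the following sketch derives a conservative per-column-family `write_buffer_size` from a container budget. It is not what Flink implements: the safety margin, the memtable share, and the function name are all assumptions made for illustration, and integer percentages are used to keep the arithmetic exact.

```python
def conservative_write_buffer_size(container_bytes: int,
                                   jvm_bytes: int,
                                   column_families: int,
                                   max_write_buffer_number: int = 2,
                                   safety_percent: int = 20,
                                   memtable_percent: int = 50) -> int:
    """Derive a per-column-family write buffer size from a container budget.

    Hypothetical policy: keep `safety_percent` of the native memory free,
    then give `memtable_percent` of the rest to memtables, shared evenly.
    """
    if column_families <= 0 or max_write_buffer_number <= 0:
        raise ValueError("counts must be positive")
    native = container_bytes - jvm_bytes
    if native <= 0:
        raise ValueError("JVM memory leaves no room for native memory")

    usable = native * (100 - safety_percent) // 100
    memtable_total = usable * memtable_percent // 100
    return memtable_total // (column_families * max_write_buffer_number)
```

For example, with a container of 3000 units, 1000 units for the JVM, and 5 column families, this policy would yield a write buffer size of 80 units per memtable.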

          People

            Assignee: Unassigned
            Reporter: Stefan Richter (srichter)
            Votes: 8
            Watchers: 42

              Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 3h 40m
