Flink / FLINK-16406

Increase default value for JVM Metaspace to minimise its OutOfMemoryError

Details

    Description

      With FLIP-49 (FLINK-13980), we introduced a limit for JVM Metaspace ('taskmanager.memory.jvm-metaspace.size') that is applied when the TaskManager (TM) JVM process is started. This caused 'OutOfMemoryError: Metaspace' for some use cases after upgrading to the latest 1.10 version. In some cases, a real class loading leak was discovered, as in FLINK-16142. In others, users had to increase the default value to accommodate their use cases (mostly from 96 MB to 256 MB).

      While this limit was introduced to properly plan Flink resources, especially in container environments, and to detect class loading leaks, the user experience should be as smooth as possible. One way to achieve this is to provide good documentation for this change (FLINK-16278).

      Another question is whether the default value is sensible. What the default should be (currently 96 MB) is still open to debate. In general, the required size depends on the use case (the job's user code, how many jobs are deployed in the cluster, etc.).

      This issue tackles the problem by first increasing the Metaspace default to 256 MB and the overall default process size in flink-conf.yaml to 1728 MB, so that the default sizes of the other memory components are not affected. We also want to poll which Metaspace setting resolved the OutOfMemoryError. If you encountered this problem and there was no class loading leak, please report here your Metaspace size and any relevant specifics of your job.
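
      For illustration, the proposed defaults could look roughly like the following flink-conf.yaml fragment. This is a sketch, not the actual patch: 'taskmanager.memory.jvm-metaspace.size' is the option named above, while 'taskmanager.memory.process.size' is assumed to be the option behind the "overall default process size". The Metaspace default grows by 256 - 96 = 160 MB, so the process size has to grow by the same amount for the other components to keep their sizes. That implies a previous process size of 1728 - 160 = 1568 MB, a value inferred from the numbers in this description rather than stated in it.

      ```yaml
      # Sketch of the proposed defaults (assumed layout, not the committed change).

      # Total TaskManager process memory: inferred previous value 1568m + 160m.
      taskmanager.memory.process.size: 1728m

      # JVM Metaspace limit: raised from 96m to 256m (+160m).
      taskmanager.memory.jvm-metaspace.size: 256m
      ```

      Users who still hit 'OutOfMemoryError: Metaspace' without a class loading leak can raise the second value further. If the process size is left unchanged, the extra Metaspace is taken from the other memory components.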

            People

              azagrebin Andrey Zagrebin