FLINK-6580: Flink on YARN doesn't start with default parameters

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 1.3.0
    • Fix Version/s: 1.3.0, 1.4.0
    • Component/s: YARN
    • Labels:
      None

      Description

      Running ./bin/yarn-session.sh -n 1 with the default configuration fails with:

      Error while deploying YARN cluster: Couldn't deploy Yarn cluster
      java.lang.RuntimeException: Couldn't deploy Yarn cluster
      	at org.apache.flink.yarn.AbstractYarnClusterDescriptor.deploy(AbstractYarnClusterDescriptor.java:436)
      	at org.apache.flink.yarn.cli.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:626)
      	at org.apache.flink.yarn.cli.FlinkYarnSessionCli$1.call(FlinkYarnSessionCli.java:482)
      	at org.apache.flink.yarn.cli.FlinkYarnSessionCli$1.call(FlinkYarnSessionCli.java:479)
      	at org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at javax.security.auth.Subject.doAs(Subject.java:415)
      	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
      	at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)
      	at org.apache.flink.yarn.cli.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:479)
      Caused by: java.lang.IllegalArgumentException: The configuration value 'containerized.heap-cutoff-min' is higher (600) than the requested amount of memory 256
      	at org.apache.flink.yarn.Utils.calculateHeapSize(Utils.java:100)
      	at org.apache.flink.yarn.AbstractYarnClusterDescriptor.setupApplicationMasterContainer(AbstractYarnClusterDescriptor.java:1263)
      	at org.apache.flink.yarn.AbstractYarnClusterDescriptor.startAppMaster(AbstractYarnClusterDescriptor.java:803)
      	at org.apache.flink.yarn.AbstractYarnClusterDescriptor.deployInternal(AbstractYarnClusterDescriptor.java:568)
      	at org.apache.flink.yarn.AbstractYarnClusterDescriptor.deploy(AbstractYarnClusterDescriptor.java:434)
      	... 9 more
      
      

      I think this issue was introduced by FLINK-5904.
      Since that change, Flink on YARN uses the parameters from the configuration file, and the memory requested there (256 in the error above) is lower than containerized.heap-cutoff-min (600).
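
      The check behind the exception above can be sketched as follows. This is a simplified, self-contained reconstruction for illustration, not the actual code of org.apache.flink.yarn.Utils.calculateHeapSize: the class name, the method signature, and the cutoff ratio of 0.25 are assumptions, while the minimum cutoff of 600 and the error message are taken from the stack trace.

```java
// Hypothetical sketch of a container heap cutoff check; names and the
// 0.25 ratio are assumptions, not copied from the Flink code base.
class HeapCutoffSketch {

    /**
     * Returns the JVM heap size (MB) that remains after reserving a cutoff
     * from the container memory for non-heap usage.
     */
    public static int calculateHeapSize(int containerMemoryMb, int minCutoffMb, float cutoffRatio) {
        if (cutoffRatio < 0 || cutoffRatio > 1) {
            throw new IllegalArgumentException("The cutoff ratio must be between 0 and 1");
        }
        // This is the condition that fails in the report: 600 > 256.
        if (minCutoffMb > containerMemoryMb) {
            throw new IllegalArgumentException(
                "The configuration value 'containerized.heap-cutoff-min' is higher ("
                    + minCutoffMb + ") than the requested amount of memory " + containerMemoryMb);
        }
        // Reserve the larger of the ratio-based cutoff and the minimum cutoff.
        int cutoffMb = Math.max((int) (containerMemoryMb * cutoffRatio), minCutoffMb);
        return containerMemoryMb - cutoffMb;
    }

    public static void main(String[] args) {
        // A 1024 MB container passes the check; the minimum cutoff of 600 applies.
        System.out.println(calculateHeapSize(1024, 600, 0.25f));
        try {
            // A 256 MB container is smaller than the minimum cutoff and is rejected.
            calculateHeapSize(256, 600, 0.25f);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

      Under these assumptions, any requested container size below the minimum cutoff is rejected before deployment, which is why a default of 256 cannot work with a cutoff minimum of 600.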

          Activity

          rmetzger Robert Metzger added a comment -

          To fix the issue, I propose to sync the defaults in flink-conf.yaml with JobManagerOptions.

          githubbot ASF GitHub Bot added a comment -

          GitHub user rmetzger opened a pull request:

          https://github.com/apache/flink/pull/3900

          FLINK-6580 Sync default heap sizes from code with config file

          Flink didn't start on YARN anymore without explicit configuration of JM and TM heap because the default heap sizes in the yaml file were set too low.
          This PR increases the default heap sizes according to the `TaskManagerOptions` and `JobManagerOptions` classes.

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/rmetzger/flink FLINK-6580

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/flink/pull/3900.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #3900


          commit d115416613313f5d01a47e92c33e87b3075008e8
          Author: Robert Metzger <rmetzger@apache.org>
          Date: 2017-05-15T09:15:48Z

          FLINK-6580 Sync default heap sizes from code with config file


          githubbot ASF GitHub Bot added a comment -

          Github user zentol commented on the issue:

          https://github.com/apache/flink/pull/3900

          Do we know the underlying reason that is causing YARN to fail now? None of the config values seem to have changed in a long time.

          githubbot ASF GitHub Bot added a comment -

          Github user zentol commented on the issue:

          https://github.com/apache/flink/pull/3900

          nvm, found the reference to FLINK-5904 in the JIRA.

          githubbot ASF GitHub Bot added a comment -

          Github user zentol commented on a diff in the pull request:

          https://github.com/apache/flink/pull/3900#discussion_r116487520

          --- Diff: flink-core/src/main/java/org/apache/flink/configuration/JobManagerOptions.java ---
          @@ -62,7 +62,7 @@
             */
            public static final ConfigOption<Integer> JOB_MANAGER_HEAP_MEMORY =
              key("jobmanager.heap.mb")
          -   .defaultValue(1024);
          +   .defaultValue(768);
          --- End diff --

          Why not 1024?

          githubbot ASF GitHub Bot added a comment -

          Github user rmetzger commented on a diff in the pull request:

          https://github.com/apache/flink/pull/3900#discussion_r116492253

          --- Diff: flink-core/src/main/java/org/apache/flink/configuration/JobManagerOptions.java ---
          @@ -62,7 +62,7 @@
             */
            public static final ConfigOption<Integer> JOB_MANAGER_HEAP_MEMORY =
              key("jobmanager.heap.mb")
          -   .defaultValue(1024);
          +   .defaultValue(768);
          --- End diff --

          Yeah, I'll go back to 1024.

          githubbot ASF GitHub Bot added a comment -

          Github user rmetzger commented on the issue:

          https://github.com/apache/flink/pull/3900

          @zentol do you agree to merge this now? It's the last thing I would like to get into RC1.

          githubbot ASF GitHub Bot added a comment -

          Github user zentol commented on the issue:

          https://github.com/apache/flink/pull/3900

          +1 to merge after you've changed it back to 1024.

          rmetzger Robert Metzger added a comment -

          Resolved for 1.3 in http://git-wip-us.apache.org/repos/asf/flink/commit/89e90f31
          For 1.4 in http://git-wip-us.apache.org/repos/asf/flink/commit/55010d0b

          githubbot ASF GitHub Bot added a comment -

          Github user asfgit closed the pull request at:

          https://github.com/apache/flink/pull/3900


            People

            • Assignee: rmetzger Robert Metzger
            • Reporter: rmetzger Robert Metzger