FLINK-4488: Prevent cluster shutdown after job execution for non-detached jobs

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.2.0, 1.1.1
    • Fix Version/s: 1.2.0, 1.1.2
    • Component/s: YARN
    • Labels:
      None

      Description

      In per-job mode, the YARN cluster currently shuts down after the first interactively executed job. However, users may want to execute multiple jobs from one JAR. I suggest using this automatic shutdown only for jobs that run detached. For interactive jobs, the CLI additionally handles cluster shutdown, which should be sufficient. In that case, the cluster could fail to shut down only if a network partition cuts the CLI off from the cluster or the CLI itself fails.
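      The proposed rule can be illustrated with a small, self-contained model. This is only a sketch under assumptions: `PerJobShutdownSketch`, `Cluster`, and `onJobFinished` are hypothetical names chosen for illustration and are not Flink's actual classes or the code changed in the pull request. It only shows the intended behavior: a detached cluster stops itself after its job, while an attached (interactive) cluster stays up until the client shuts it down.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of the proposed shutdown rule; not Flink's real API. */
public class PerJobShutdownSketch {

    /** Stand-in for a per-job YARN cluster. */
    public static final class Cluster {
        private final boolean detached;
        private final List<String> finishedJobs = new ArrayList<>();
        private boolean running = true;

        public Cluster(boolean detached) {
            this.detached = detached;
        }

        /** Invoked when a job finishes on this cluster. */
        public void onJobFinished(String jobName) {
            if (!running) {
                throw new IllegalStateException("cluster already shut down before " + jobName);
            }
            finishedJobs.add(jobName);
            // Proposed rule: shut down automatically only when no client is attached.
            if (detached) {
                shutdown();
            }
        }

        /** Explicit shutdown, e.g. triggered by the CLI for interactive jobs. */
        public void shutdown() {
            running = false;
        }

        public boolean isRunning() {
            return running;
        }

        public int finishedJobCount() {
            return finishedJobs.size();
        }
    }

    public static void main(String[] args) {
        // Interactive mode: two jobs from one JAR run on the same cluster.
        Cluster interactive = new Cluster(false);
        interactive.onJobFinished("first job");
        interactive.onJobFinished("second job");
        System.out.println("interactive cluster running after 2 jobs: " + interactive.isRunning());
        interactive.shutdown(); // the client shuts the cluster down when it is done

        // Detached mode: the cluster stops itself after its job.
        Cluster detached = new Cluster(true);
        detached.onJobFinished("only job");
        System.out.println("detached cluster running after 1 job: " + detached.isRunning());
    }
}
```

      In this model the interactive cluster survives both jobs and stops only on the explicit `shutdown()` call, which mirrors the case the description leaves to the CLI. If that call never arrives (network partition or CLI outage), the modeled cluster simply keeps running, which is the residual risk the description mentions.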

          Activity

          githubbot ASF GitHub Bot added a comment -

          GitHub user mxm opened a pull request:

          https://github.com/apache/flink/pull/2419

          FLINK-4488 only automatically shutdown clusters for detached jobs

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/mxm/flink FLINK-4488

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/flink/pull/2419.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #2419


          commit 8623083bc0147d89ce433da677ff2d4ed6ecd768
          Author: Maximilian Michels <mxm@apache.org>
          Date: 2016-08-25T13:41:12Z

          FLINK-4488 only automatically shutdown clusters for detached jobs


          githubbot ASF GitHub Bot added a comment -

          Github user StephanEwen commented on the issue:

          https://github.com/apache/flink/pull/2419

          That makes sense, +1 to merge this for 1.1.2 and 1.2

          Wondering if we can guard this via a test, so that the FLIP-6 refactoring does not re-introduce the bug.

          githubbot ASF GitHub Bot added a comment -

          Github user mxm commented on the issue:

          https://github.com/apache/flink/pull/2419

          Our current tests make it difficult to test such behavior. I added a check to the `YarnTestBase`. Basically, I'm skipping the cluster shutdown to check that the JobManager is still alive and hasn't been shut down through other means.

          githubbot ASF GitHub Bot added a comment -

          Github user asfgit closed the pull request at:

          https://github.com/apache/flink/pull/2419

          mxm Maximilian Michels added a comment -

          master: 87114cd2aaf78de2114d1ea4ab7bd2b57494d716
          release-1.1: 28da995494ad21223f6911f27eb46187294f311a


            People

            • Assignee: Maximilian Michels (mxm)
            • Reporter: Maximilian Michels (mxm)
            • Votes: 0
            • Watchers: 2
