FLINK-16279

Per job Yarn application leak in normal execution mode.


Details

    Description

      I ran a job in YARN per-job mode using env.executeAsync. The job failed, but the YARN cluster was not destroyed.

      After looking into the code, I found the following:

      When running in attached mode, MiniDispatcher never completes shutDownFuture until it has received a job result request from the job client.

      		if (executionMode == ClusterEntrypoint.ExecutionMode.NORMAL) {
      			// terminate the MiniDispatcher once we served the first JobResult successfully
      			jobResultFuture.thenAccept((JobResult result) -> {
      				ApplicationStatus status = result.getSerializedThrowable().isPresent() ?
      						ApplicationStatus.FAILED : ApplicationStatus.SUCCEEDED;
      
      				LOG.debug("Shutting down per-job cluster because someone retrieved the job result.");
      				shutDownFuture.complete(status);
      			});
      		} 
      

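      However, when running in async mode (submitting the job via env.executeAsync), there may be no such request from the job client: once a user sees from the job client that the job has failed, they may never request the result again. In that case shutDownFuture is never completed and the YARN application keeps running.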
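
      The behaviour described above can be reproduced in isolation with plain CompletableFuture. The sketch below is only a simplified model, not Flink code: ShutdownLeakSketch, SketchDispatcher, Status, jobFinished and requestJobResult are hypothetical names introduced for illustration. It assumes, as in the snippet above, that the shutdown callback is registered only when a client asks for the job result.

```java
import java.util.concurrent.CompletableFuture;

public class ShutdownLeakSketch {

    /** Hypothetical stand-in for ApplicationStatus; not the Flink enum. */
    enum Status { SUCCEEDED, FAILED }

    /** Hypothetical, heavily simplified model of a per-job dispatcher. */
    static class SketchDispatcher {
        private final CompletableFuture<Status> jobResult = new CompletableFuture<>();
        final CompletableFuture<Status> shutDownFuture = new CompletableFuture<>();

        /** The job reaches a terminal state; nothing on this path triggers shutdown. */
        void jobFinished(Status status) {
            jobResult.complete(status);
        }

        /** A client asks for the result; only this path completes shutDownFuture. */
        CompletableFuture<Status> requestJobResult() {
            jobResult.thenAccept(shutDownFuture::complete);
            return jobResult;
        }
    }

    public static void main(String[] args) {
        SketchDispatcher dispatcher = new SketchDispatcher();

        // The job fails, but no client has asked for the result yet.
        dispatcher.jobFinished(Status.FAILED);
        System.out.println(dispatcher.shutDownFuture.isDone()); // prints false

        // Shutdown is triggered only once somebody retrieves the result.
        dispatcher.requestJobResult();
        System.out.println(dispatcher.shutDownFuture.isDone()); // prints true
    }
}
```

      If requestJobResult is never called, which is the executeAsync case described above, shutDownFuture in this model stays incomplete forever, which corresponds to the leaked YARN application.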

          People

            Assignee: Unassigned
            Reporter: Wenlong Lyu (wenlong.lwl)