  1. Flink
  2. FLINK-23194

Cache and reuse the ContainerLaunchContext and accelerate the process of createTaskExecutorLaunchContext on YARN

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Won't Do
    • Affects Version/s: 1.13.1, 1.12.4
    • Fix Version/s: None
    • Component/s: Deployment / YARN
    • Labels: None

    Description

      When starting TaskExecutors in containers on YARN, a ContainerLaunchContext is created n times, where n is the number of TaskManagers.

      When I examined this creation process, I found that most of the context is common to all containers and independent of the particular TaskManager; only the launchCommand differs. We can create the ContainerLaunchContext once and reuse it, so that only the launchCommand needs to be created separately for each TaskManager.

      So I propose that we cache and reuse the ContainerLaunchContext object to accelerate this creation process.
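
      A minimal sketch of the idea, not the actual patch: it uses no real YARN or Flink classes. SharedLaunchParts, LaunchContextSketch, LaunchContextCache and the demo values are hypothetical stand-ins for the TaskManager-independent parts (local resources, environment), the per-container context, and the cache.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical stand-in for the parts that are identical for every TaskManager. */
final class SharedLaunchParts {
    final Map<String, String> localResources; // e.g. files that would be resolved against HDFS
    final Map<String, String> environment;

    SharedLaunchParts(Map<String, String> localResources, Map<String, String> environment) {
        // Defensive, read-only copies so the cached parts cannot be changed by one container.
        this.localResources = Collections.unmodifiableMap(new HashMap<>(localResources));
        this.environment = Collections.unmodifiableMap(new HashMap<>(environment));
    }
}

/** Hypothetical per-container context: the shared parts plus one launch command. */
final class LaunchContextSketch {
    final SharedLaunchParts shared;
    final List<String> commands;

    LaunchContextSketch(SharedLaunchParts shared, String launchCommand) {
        this.shared = shared;
        this.commands = Collections.singletonList(launchCommand);
    }
}

/** Builds the shared parts once, then only attaches a launch command per TaskManager. */
final class LaunchContextCache {
    private final Supplier<SharedLaunchParts> expensiveBuilder;
    private SharedLaunchParts cached;
    private int buildCount;

    LaunchContextCache(Supplier<SharedLaunchParts> expensiveBuilder) {
        this.expensiveBuilder = expensiveBuilder;
    }

    synchronized LaunchContextSketch createFor(String launchCommand) {
        if (cached == null) {
            cached = expensiveBuilder.get(); // the costly step runs only on first use
            buildCount++;
        }
        return new LaunchContextSketch(cached, launchCommand);
    }

    synchronized int buildCount() {
        return buildCount;
    }
}

final class LaunchContextCacheDemo {
    public static void main(String[] args) {
        LaunchContextCache cache = new LaunchContextCache(() -> new SharedLaunchParts(
                Collections.singletonMap("example.jar", "example-remote-path"),
                Collections.singletonMap("EXAMPLE_KEY", "example-value")));
        for (int i = 0; i < 3; i++) {
            LaunchContextSketch ctx = cache.createFor("start-taskmanager --id " + i);
            System.out.println(ctx.commands.get(0));
        }
        System.out.println("shared parts built: " + cache.buildCount());
    }
}
```

      In this sketch each container gets its own object that references the cached, read-only parts instead of mutating one shared instance. A real implementation would have to take the same care, because the launch command is set per container.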

      I think this has the following benefits:

      1. It accelerates the creation of the ContainerLaunchContext and therefore the start of the TaskExecutor, especially when there is a large number of TaskManagers.
      2. It reduces the load on HDFS, etc.
      3. It also avoids being affected by a sudden failure of HDFS or YARN, etc.

      We have implemented this in our production environment. So far it has caused no problems and has shown a good benefit. Please let me know if there is any point that I haven't considered.

          People

            Assignee: Unassigned
            Reporter: zlzhang0122
            Votes: 0
            Watchers: 3