Currently, YarnResourceManager starts all task executors in main thread. This could cause RM to become unresponsive when launching a large number of TEs (e.g. > 1000) because it involves blocking I/O operations (writing files to HDFS, communicating with the node manager using a synchronous NMClient). As a consequence, TE registration/heartbeat timeouts can occur and Flink might allocate too many excessive containers (see
FLINK-12342) because it cannot process the YarnResourceManager#onContainersAllocated calls.
There are different solution approaches but the end goal should be to not execute any blocking calls in the ResourceManager's main thread:
1. Start the TaskExecutors from a different thread (potentially thread pool) which is responsible for uploading the files and communicating with the NodeManager
2. Don't upload files (avoid blocking file system operations) and use the NMClientAsync for the communication with Yarn's NodeManager.
3. Upload files in a separate I/O thread and use the NMClientAsync for the communication with Yarn's NodeManager.