Building on the benchmarks in org.apache.flink.streaming.runtime.io.benchmark and the flink-benchmark repo, I propose introducing another micro-benchmark that focuses on JobMaster scheduling performance.
The benchmark measures the time from JobMaster startup (receiving the JobGraph and initializing) until all tasks are RUNNING. Technically, we use a bounded stream and the TM finishes tasks as soon as they arrive, so the interval we actually measure ends when all tasks are FINISHED.
1. A JobGraph that covers EAGER + PIPELINED edges
2. A JobGraph that covers LAZY_FROM_SOURCES + PIPELINED edges
3. A JobGraph that covers LAZY_FROM_SOURCES + BLOCKING edges
PS: maybe also benchmark the case where the source gets its input from InputSplits?
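A rough sketch of how the three combinations above could be written down as benchmark parameters. The enums and the SchedulingScenario class are hypothetical local stand-ins named after the modes listed above; they are not the Flink classes themselves, and the real benchmark would build an actual JobGraph for each entry.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-ins for illustration only, not Flink's own types.
enum ScheduleMode { EAGER, LAZY_FROM_SOURCES }

enum EdgeType { PIPELINED, BLOCKING }

/** One benchmark scenario: a schedule mode combined with an edge type. */
final class SchedulingScenario {
    final ScheduleMode scheduleMode;
    final EdgeType edgeType;

    SchedulingScenario(ScheduleMode scheduleMode, EdgeType edgeType) {
        this.scheduleMode = scheduleMode;
        this.edgeType = edgeType;
    }

    /** The three combinations proposed in this ticket, in the same order. */
    static List<SchedulingScenario> proposed() {
        return Collections.unmodifiableList(Arrays.asList(
                new SchedulingScenario(ScheduleMode.EAGER, EdgeType.PIPELINED),
                new SchedulingScenario(ScheduleMode.LAZY_FROM_SOURCES, EdgeType.PIPELINED),
                new SchedulingScenario(ScheduleMode.LAZY_FROM_SOURCES, EdgeType.BLOCKING)));
    }

    @Override
    public String toString() {
        return scheduleMode + "+" + edgeType;
    }

    public static void main(String[] args) {
        // Print each scenario name, e.g. as it might appear as a benchmark parameter.
        for (SchedulingScenario scenario : proposed()) {
            System.out.println(scenario);
        }
    }
}
```

Keeping the scenarios as plain data would make it easy to add the InputSplit case later without touching the measurement code.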
As in the flink-benchmark repo, the benchmarks would ultimately be run with JMH, so the whole test suite is split across two repos. The testing environment could live in the main repo, maybe under flink-runtime/src/test/java/org/apache/flink/runtime/jobmaster/benchmark.
To measure the performance of JobMaster scheduling, we need to simulate an environment that:
1. has a real JobMaster
2. has a mock/testing ResourceManager that has infinite resources and reacts immediately.
3. has one (or many?) mock/testing TaskExecutor that deploys and finishes tasks immediately.
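To make the shape of that environment concrete, here is a toy model of the measurement, not Flink code. All class names (InstantResourceManager, InstantTaskExecutor, ToySchedulingRun) are made up for illustration, and the loop in measureNanos only stands in for the place where a real JobMaster would sit; the point is just to show the stubs answering immediately and the timer stopping once every task has reported FINISHED.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Toy stand-in for a ResourceManager with unlimited slots that answers immediately. */
final class InstantResourceManager {
    private final AtomicInteger nextSlot = new AtomicInteger();

    /** Never blocks and never runs out: every request gets a fresh slot id. */
    int requestSlot() {
        return nextSlot.getAndIncrement();
    }
}

/** Toy stand-in for a TaskExecutor that finishes every task as soon as it is deployed. */
final class InstantTaskExecutor {
    void deploy(int slot, Runnable onFinished) {
        onFinished.run(); // report FINISHED right away, no user code is executed
    }
}

final class ToySchedulingRun {
    /** Schedules taskCount tasks and returns the nanoseconds until all of them reported FINISHED. */
    static long measureNanos(int taskCount) {
        InstantResourceManager resourceManager = new InstantResourceManager();
        InstantTaskExecutor taskExecutor = new InstantTaskExecutor();
        CountDownLatch allFinished = new CountDownLatch(taskCount);

        long start = System.nanoTime(); // corresponds to "JobMaster startup"
        for (int i = 0; i < taskCount; i++) {
            // Placeholder for the scheduling work a real JobMaster would do per task.
            int slot = resourceManager.requestSlot();
            taskExecutor.deploy(slot, allFinished::countDown);
        }
        try {
            if (!allFinished.await(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("not all tasks finished in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for tasks", e);
        }
        return System.nanoTime() - start; // interval ends when all tasks are FINISHED
    }

    public static void main(String[] args) {
        System.out.println("elapsed ns: " + measureNanos(10_000));
    }
}
```

Because both stubs return immediately, whatever time is measured in the real setup should be dominated by the JobMaster itself, which is what the benchmark is meant to isolate.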
Any suggestions are welcome.