Description
The planner does the planning in multiple stages when it finds job dependencies on ReadableData. One example of this is a pipeline that uses the BloomFilterJoinStrategy.
While the generated plan dot file looks correct, the planner does not actually add dependencies between jobs that are created in different planning stages.
I have a pipeline that reads 3 input sources. It joins 2 of them using a bloom filter join strategy. Later on, it joins that result with the output of a job that reads the third source path.
If the jobs on the branch using the bloom filter finish before the job reading the third source, the executor attempts to start the 4th job, which is supposed to join everything, before the 3rd one finishes, resulting in an input path not found exception.
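
The race can be modelled with a small standalone sketch. This is not the planner's or executor's actual code: the class, the method names, and the job numbering (jobs 1 and 2 on the bloom filter branch, job 3 reading the third source, job 4 doing the final join) are assumptions made for illustration. It only shows that when the edge from job 3 to job 4 is missing, a dependency check lets job 4 start as soon as the bloom filter branch is done.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class JobDependencySketch {

    // A job may start only when every job it depends on has finished.
    static boolean canStart(int job, Map<Integer, Set<Integer>> deps, Set<Integer> finished) {
        return finished.containsAll(deps.get(job));
    }

    // Hypothetical plan: job id -> ids of the jobs it depends on.
    // With includeCrossStageEdge == false, the edge 3 -> 4 is dropped,
    // which mimics the missing dependency described in this report.
    static Map<Integer, Set<Integer>> plan(boolean includeCrossStageEdge) {
        Map<Integer, Set<Integer>> deps = new HashMap<>();
        deps.put(1, Collections.<Integer>emptySet());          // assumed: first job on the bloom filter branch
        deps.put(2, new HashSet<>(Arrays.asList(1)));          // assumed: bloom filter join
        deps.put(3, Collections.<Integer>emptySet());          // job reading the third source
        deps.put(4, includeCrossStageEdge
                ? new HashSet<>(Arrays.asList(2, 3))           // expected dependencies
                : new HashSet<>(Arrays.asList(2)));            // cross-stage edge missing
        return deps;
    }

    public static void main(String[] args) {
        // The bloom filter branch has finished; job 3 is still running.
        Set<Integer> finished = new HashSet<>(Arrays.asList(1, 2));

        // Missing edge: job 4 is allowed to start although job 3's output does not exist yet.
        System.out.println("without edge 3->4: " + canStart(4, plan(false), finished));

        // Edge present: job 4 is held back until job 3 finishes.
        System.out.println("with edge 3->4:    " + canStart(4, plan(true), finished));
    }
}
```

In this model the first check returns true and the second returns false, which matches the observed behaviour: the final join is launched while its input path from the third source is still missing.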