Affects Version/s: None
Fix Version/s: None
Here is what I am looking for:
- We have a Dockerized Flink HA cluster (3 ZooKeeper nodes, 2 JobManagers, 3 TaskManagers).
- Whenever we cancel a Flink job, the job is cancelled, but its job ID is not automatically deleted from the ZooKeeper metadata (inside the flink/jobgraph folder in ZooKeeper).
- As a result, whenever one of the JobManagers goes down or is restarted, it does not come back up and throws an exception like "Could not find this job id xxxxxxxxxx".
- The current workaround is to manually remove the cancelled job ID from the ZooKeeper metadata, but this is not a recommended solution.
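
For reference, the manual workaround described above could be scripted roughly as follows. This is only a sketch under assumptions, not a recommended fix: the helper names (`stale_job_ids`, `remove_stale_job_graphs`) are hypothetical, the znode path is passed in by the caller because the exact location depends on the cluster's HA configuration, and `zk` is assumed to be any client object exposing `get_children(path)` and `delete(path, recursive=True)`, which is the shape of kazoo's `KazooClient`. It defaults to a dry run so nothing is deleted unless explicitly requested.

```python
def stale_job_ids(znode_children, active_job_ids):
    """Return job IDs stored in ZooKeeper that are not in the active set.

    Comparison is case-insensitive; the result is sorted for stable output.
    """
    active = {job_id.lower() for job_id in active_job_ids}
    return sorted(c for c in znode_children if c.lower() not in active)


def remove_stale_job_graphs(zk, jobgraph_path, active_job_ids, dry_run=True):
    """Delete job-graph znodes whose job ID is no longer active.

    zk             -- assumed client with get_children() and delete()
                      (for example a started kazoo KazooClient)
    jobgraph_path  -- assumed znode holding one child per job ID
    active_job_ids -- job IDs that must be kept (e.g. still running)
    dry_run        -- when True, only report what would be deleted
    """
    base = jobgraph_path.rstrip("/")
    stale = stale_job_ids(zk.get_children(base), active_job_ids)
    for job_id in stale:
        if not dry_run:
            # recursive=True also removes any children under the job's znode
            zk.delete(base + "/" + job_id, recursive=True)
    return stale
```

With kazoo, usage might look like `zk = KazooClient(hosts="zk1:2181"); zk.start()` followed by `remove_stale_job_graphs(zk, "/flink/jobgraph", running_ids)`, where the host name, the path, and `running_ids` are placeholders for this cluster's actual values. Reviewing the dry-run result before passing `dry_run=False` keeps the manual step explicit.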