Falcon fails to restart if any service referenced by a cluster entity is unreachable or down.
For example, consider three clusters X, Y, and Z. On cluster X, submit cluster entities that point to the services of clusters Y and Z, then run some replication jobs from cluster X to cluster Y and to cluster Z. If, some time later, the HDFS service on cluster Z is down for maintenance and the Falcon service on cluster X has to be restarted for some reason, Falcon fails to restart on cluster X.
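The scenario can be modelled with a small, self-contained sketch. This is not Falcon's code, and the report does not say where in the startup sequence the failure occurs. The sketch only assumes that startup checks every endpoint of every registered cluster entity and aborts on the first one that is down. The function names, the `ServiceUnreachable` exception, and the `hdfs://y-nn:8020` and `hdfs://z-nn:8020` endpoints are all hypothetical.

```python
# Hypothetical model of the reported behaviour; not taken from Falcon's source.

class ServiceUnreachable(Exception):
    """Raised when an endpoint of a registered cluster entity is down."""


def start_strict(cluster_entities, is_reachable):
    """Abort startup as soon as any registered endpoint is unreachable."""
    for name, endpoints in cluster_entities.items():
        for endpoint in endpoints:
            if not is_reachable(endpoint):
                raise ServiceUnreachable(f"cluster {name}: {endpoint} is not reachable")
    return "started"


def start_tolerant(cluster_entities, is_reachable):
    """Start anyway and report the endpoints that could not be reached."""
    unreachable = [
        (name, endpoint)
        for name, endpoints in cluster_entities.items()
        for endpoint in endpoints
        if not is_reachable(endpoint)
    ]
    return "started", unreachable


# Cluster entities submitted on cluster X (endpoints are made up).
entities = {
    "Y": ["hdfs://y-nn:8020"],
    "Z": ["hdfs://z-nn:8020"],
}

# Cluster Z's HDFS service is down for maintenance.
down = {"hdfs://z-nn:8020"}


def is_reachable(endpoint):
    return endpoint not in down
```

With these inputs, `start_strict(entities, is_reachable)` raises `ServiceUnreachable` for cluster Z, which mirrors the reported symptom. `start_tolerant` shows one possible alternative: it completes startup and returns the unreachable endpoints so they can be retried later.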
This issue has been reported internally at Hortonworks.