Details
- Bug
- Status: Closed
- Major
- Resolution: Fixed
- 0.14
- None
Description
When a broker is federated with a cluster, the cluster informs the broker of the failover addresses that are valid for the cluster. Should a cluster member fail, the broker will reconnect to another member of that cluster.
However, the federated broker queries the cluster for these failover addresses only when it first connects to the cluster. Should the cluster topology change afterwards, the federated broker's list of available failover addresses becomes out of date. This can prevent the broker from correctly reconnecting when a cluster member fails.
Example:
Given a cluster with members C1 and C2 and a separate broker B, federate B to connect to C1. On connecting to C1, B learns the address of C2 as an alternate failover address. Now shut down C1. B will reconnect to C2 and learn that C2 is the only member of the cluster (i.e., there are no failover addresses). After B connects, restart C1 and let it rejoin the cluster. Then shut down C2. Since B does not know that C1 has become available again, it will not attempt to reconnect to C1. Instead, it tries to reconnect to C2 indefinitely.
The expected behavior would be to have B reconnect to C1.
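The scenario above can be reproduced with a small toy model. This is only an illustrative sketch, not broker code: the classes `Cluster` and `FederatedBroker`, the `refresh_on_change` flag, and the `on_topology_change` hook are all hypothetical names invented for this example, and the sketch makes no claim about how the actual fix was implemented. It assumes the broker tries its current address first and then each known failover address.

```python
class Cluster:
    """Toy cluster that tracks which members are currently running."""

    def __init__(self, members):
        self.up = set(members)

    def stop(self, member):
        self.up.discard(member)

    def start(self, member):
        self.up.add(member)

    def failover_addresses(self, connected_to):
        # Alternates the cluster would advertise to a link on `connected_to`.
        return sorted(self.up - {connected_to})


class FederatedBroker:
    """Toy federated broker (hypothetical; models only the failover list)."""

    def __init__(self, cluster, refresh_on_change):
        self.cluster = cluster
        self.refresh_on_change = refresh_on_change
        self.current = None
        self.failover = []

    def connect(self, member):
        if member not in self.cluster.up:
            return False
        self.current = member
        # Reported behavior: the list is fetched only at connect time.
        self.failover = self.cluster.failover_addresses(member)
        return True

    def on_topology_change(self):
        # Expected behavior (assumed here): refresh the list whenever the
        # membership changes while the link to the cluster is still up.
        if self.refresh_on_change and self.current in self.cluster.up:
            self.failover = self.cluster.failover_addresses(self.current)

    def reconnect(self):
        # Try the last known address, then each known failover address.
        for candidate in [self.current] + self.failover:
            if self.connect(candidate):
                return candidate
        return None  # the real broker would keep retrying


def run_scenario(refresh_on_change):
    cluster = Cluster(["C1", "C2"])
    b = FederatedBroker(cluster, refresh_on_change)
    b.connect("C1")            # B learns C2 as a failover address
    cluster.stop("C1")
    b.on_topology_change()
    b.reconnect()              # B moves to C2; failover list is now empty
    cluster.start("C1")        # C1 rejoins the cluster
    b.on_topology_change()
    cluster.stop("C2")
    b.on_topology_change()
    return b.reconnect()       # where does B end up?


print(run_scenario(refresh_on_change=False))
print(run_scenario(refresh_on_change=True))
```

With `refresh_on_change=False` the model ends with no reachable address, which corresponds to B retrying C2 forever. With `refresh_on_change=True` B has learned about C1 before C2 goes down and reconnects to it.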