Description
Client configuration currently requires all masters to be specified when building the Kudu client and connecting to the cluster.
KUDU-2200 relaxed this requirement: as long as the leader master is included in the list, connecting to the cluster succeeds. A comment on that JIRA recommends against relaxing the requirement further so that any single master may be specified, as that master could fail.
In the common case of migrating from a single-master configuration to multiple masters (1 -> 3), it’s possible that one of the newly added masters becomes the leader. In that case, if the client configuration is not updated, requests to the cluster will start failing under the current implementation.
Relaxing the requirement so that any master can be specified means the client configuration need not be updated immediately, as long as at least one of the specified masters remains part of the multi-master configuration, even as a follower. If all the masters specified in the client configuration fail while at least one other master remains healthy, requests to the cluster will start failing, which is no different from the current implementation.
As long as any master specified in the client configuration, leader or not, is part of the current multi-master configuration, the cluster will continue to function for that client.
Adding a comment from granthenke that delves into the implementation:
“Currently we require that all masters are listed in client configurations. Instead we could require at least one is listed and as a part of the first step in connecting a client send a GetMasterAddresses request to discover the others. This can be done in a round robin fashion until a live master is found and reports all the configured masters. In this way users can migrate their master/masters and then slowly adjust the master list over time. This could also allow them to use a loadbalancer/proxy to resolve one of the master addresses and then get the real addresses in the first response.
In fact even tablet servers could be used to bootstrap the list of masters. Optionally using something like a cluster id could be helpful so users can be confident the bootstrap to the correct cluster.”
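The round-robin discovery described in the comment can be sketched roughly as follows. This is a simulation under stated assumptions, not Kudu client code: `discover_masters` and the `MasterLister` callback are hypothetical names, and the callback merely stands in for a GetMasterAddresses request against a single address.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical stand-in for a GetMasterAddresses request: given an address,
# returns the master list that node reports, or None if it is unreachable.
MasterLister = Callable[[str], Optional[List[str]]]


def discover_masters(seeds: List[str], get_master_addresses: MasterLister) -> List[str]:
    """Try each configured seed in order; return the first reported master list."""
    for addr in seeds:
        reported = get_master_addresses(addr)
        if reported is not None:
            return sorted(reported)
    raise ConnectionError("no configured master is reachable: %s" % seeds)


# Simulated cluster after a 1 -> 3 migration (addresses are made up).
all_masters = ["m1:7051", "m2:7051", "m3:7051"]
cluster: Dict[str, List[str]] = {addr: list(all_masters) for addr in all_masters}

# The client configuration still lists only the original master.
print(discover_masters(["m1:7051"], cluster.get))
```

With this shape, a stale client configuration keeps working as long as one listed address still answers, and the list can be updated later.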
From an implementation perspective, if the leader is not part of the client configuration, the client should run the leader-determination loop a second time. The first pass runs against the list of masters specified in the client configuration. If the leader is not among them, the second pass runs against the list of masters returned by the follower masters in the client configuration. In case of a discrepancy between the master lists returned by the configured masters, it’s safer to fail determining the leader.
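A minimal sketch of the two-pass leader determination, again as a simulation rather than actual Kudu code. `MasterInfo` and `find_leader` are hypothetical names, the cluster is modeled as a plain dictionary, and the discrepancy check is applied only when the second pass is needed, which is one possible reading of the proposal.

```python
from typing import Dict, List, NamedTuple


class MasterInfo(NamedTuple):
    is_leader: bool
    peers: List[str]  # master list as reported by this master


def find_leader(configured: List[str], cluster: Dict[str, MasterInfo]) -> str:
    # Pass 1: look for the leader among the reachable configured masters.
    reachable = [addr for addr in configured if addr in cluster]
    if not reachable:
        raise ConnectionError("no configured master is reachable")
    for addr in reachable:
        if cluster[addr].is_leader:
            return addr
    # Only followers were configured: their reported master lists must agree.
    reported = {frozenset(cluster[addr].peers) for addr in reachable}
    if len(reported) != 1:
        raise RuntimeError("configured masters disagree on the master list")
    # Pass 2: look for the leader among the masters the followers reported.
    for addr in sorted(next(iter(reported))):
        info = cluster.get(addr)
        if info is not None and info.is_leader:
            return addr
    raise RuntimeError("no leader found among reported masters")


# Simulated 3-master cluster where a newly added master (m2) became leader.
peers = ["m1:7051", "m2:7051", "m3:7051"]
cluster = {
    "m1:7051": MasterInfo(False, peers),
    "m2:7051": MasterInfo(True, peers),
    "m3:7051": MasterInfo(False, peers),
}
print(find_leader(["m1:7051"], cluster))  # client config lists only a follower
```

Failing on disagreement rather than picking one of the reported lists keeps the client from following a master that holds a stale or foreign configuration.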
Relaxing the requirement that the leader master be listed in the client configuration will allow clients to migrate gradually to the updated list of master addresses when new masters are added.
Another enhancement, suggested by Grant, would be the addition of a cluster ID to ensure masters are added to or removed from the correct cluster and to prevent accidental misconfiguration.