Description
In the following sequence of events, the Java client fails to locate the new leader master after a failover and stays "stuck" until the client is restarted:
- client connects to the cluster and caches the master locations
- client opens a table and caches tablet locations
- the master fails over to a new leader
- the tablet either goes down or fails over, causing the client to need to update its tablet locations
In this case, the client gets stuck in a retry loop and is never able to connect to the new leader master.
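
The sequence above can be sketched with a toy model. This is not the Kudu client code: the classes `Master` and `Client`, the `refreshOnNotLeader` flag, and the node names are all hypothetical. The sketch also assumes that the stuck state comes from a cached master location that is never refreshed, which is one plausible reading of the symptoms and not a confirmed root cause.

```java
import java.util.Arrays;
import java.util.List;

public class MasterFailoverSketch {

    /** Hypothetical stand-in for a master; only the current leader can answer lookups. */
    static final class Master {
        final String name;
        boolean leader;

        Master(String name, boolean leader) {
            this.name = name;
            this.leader = leader;
        }
    }

    /** Hypothetical client that caches the leader master it finds when it connects. */
    static final class Client {
        private final List<Master> masters;
        private final boolean refreshOnNotLeader;
        private Master cachedLeader;

        Client(List<Master> masters, boolean refreshOnNotLeader) {
            this.masters = masters;
            this.refreshOnNotLeader = refreshOnNotLeader;
            this.cachedLeader = findLeader(); // step 1: connect and cache the master location
        }

        private Master findLeader() {
            for (Master m : masters) {
                if (m.leader) {
                    return m;
                }
            }
            throw new IllegalStateException("no leader master");
        }

        /** Returns the name of the master that answered, or null if every attempt failed. */
        String lookupTabletLocations(int maxAttempts) {
            for (int attempt = 0; attempt < maxAttempts; attempt++) {
                if (cachedLeader.leader) {
                    return cachedLeader.name;
                }
                if (refreshOnNotLeader) {
                    // Drop the stale entry and look for the current leader again.
                    cachedLeader = findLeader();
                }
                // Otherwise the next attempt goes to the same stale master.
            }
            return null;
        }
    }

    /** Runs the four steps from the description and reports which master answered. */
    static String simulate(boolean refreshOnNotLeader) {
        Master a = new Master("master-a", true);
        Master b = new Master("master-b", false);
        Client client = new Client(Arrays.asList(a, b), refreshOnNotLeader);

        // Step 3: the master fails over to a new leader.
        a.leader = false;
        b.leader = true;

        // Step 4: the tablet moved, so the client must ask a master for new locations.
        return client.lookupTabletLocations(5);
    }

    public static void main(String[] args) {
        System.out.println("stale cache:  " + simulate(false));
        System.out.println("with refresh: " + simulate(true));
    }
}
```

In this model the client without the refresh exhausts its attempts against the old leader, while the client that re-resolves the leader succeeds on the following attempt. The bounded attempt count is only there to keep the sketch terminating; the report describes a loop that does not end until the client is restarted.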