Description
There is a problem with Geode WAN replication when GW receivers are configured with the same hostname-for-senders and port on all servers. [ 1 ]
The observed problem is that shutting down one server stops replication to this cluster until that server is up again. Because all receivers share the same hostname-for-senders and port, Geode treats them as a single server, so it incorrectly concludes that no servers are alive when only one of them is down.
Our proposal is to expand the internal data held by locators with enough information to distinguish servers in the aforementioned use case. The same change is likely needed in the client pools and possibly elsewhere in the source code.
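The following is a minimal sketch of why a shared hostname-for-senders and port can make several receivers look like one, and how an extra distinguishing field avoids that. It is not Geode's locator code: the class names, the `memberId` parameter, and the host name `vip.example.com` are all hypothetical and chosen only for illustration. The ticket does not specify which additional information the locators should store.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch only; none of these types exist in Geode.
public class ReceiverTrackingSketch {

    /** Tracks receivers keyed only by "host:port" (the problematic identity). */
    static class HostPortTracker {
        private final Set<String> alive = new HashSet<>();

        void serverUp(String host, int port, String memberId) {
            // memberId is ignored, so receivers behind one VIP collapse into one entry.
            alive.add(host + ":" + port);
        }

        void serverDown(String host, int port, String memberId) {
            // Removing the shared key drops every receiver behind that host and port.
            alive.remove(host + ":" + port);
        }

        boolean hasLiveReceiver() {
            return !alive.isEmpty();
        }
    }

    /** Tracks receivers keyed by host, port and a per-server id (assumed extra field). */
    static class MemberAwareTracker {
        private final Set<String> alive = new HashSet<>();

        void serverUp(String host, int port, String memberId) {
            alive.add(host + ":" + port + "@" + memberId);
        }

        void serverDown(String host, int port, String memberId) {
            // Only the entry for this particular server is removed.
            alive.remove(host + ":" + port + "@" + memberId);
        }

        boolean hasLiveReceiver() {
            return !alive.isEmpty();
        }
    }

    public static void main(String[] args) {
        HostPortTracker byHostPort = new HostPortTracker();
        byHostPort.serverUp("vip.example.com", 5000, "server-1");
        byHostPort.serverUp("vip.example.com", 5000, "server-2");
        byHostPort.serverDown("vip.example.com", 5000, "server-1");
        System.out.println(byHostPort.hasLiveReceiver()); // prints "false"

        MemberAwareTracker byMember = new MemberAwareTracker();
        byMember.serverUp("vip.example.com", 5000, "server-1");
        byMember.serverUp("vip.example.com", 5000, "server-2");
        byMember.serverDown("vip.example.com", 5000, "server-1");
        System.out.println(byMember.hasLiveReceiver()); // prints "true"
    }
}
```

With the host-and-port key, stopping `server-1` removes the only entry even though `server-2` is still running, which mirrors the behavior described in this ticket. Adding any field that is unique per server keeps the remaining receiver visible.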
[ 1 ] : The reason for such a setup is deploying a Geode cluster on a Kubernetes cluster where all GW receivers are reachable from the outside world on the same VIP and port. Other configurations (a different hostname and/or a different port for each GW receiver) are costly in terms of OAM and resources in cloud-native environments, and they also limit some important use cases (such as scaling).
Link to thread in DEV mailing list: https://markmail.org/thread/6qakx67rxiokdsec