When a member is forced out of the distributed system and disable-auto-reconnect=false (the default), it attempts to close its cache, disconnect from the cluster, and then reconnect and create a new cache. Because of the way this is implemented, the old cache is kept in memory while the new cache is being created. This can cause reconnect to use much more memory than it needs. That memory is freed after the reconnect completes, but the JVM can run out of memory during the reconnect.
So far I have found two places that keep the old cache around:
1. InternalDistributedSystem.tryReconnect is passed the old cache as a parameter. Only one caller exists and only a small block of code in tryReconnect needs the old cache. So it would be easy to fix this by not passing it in as a parameter.
2. InternalDistributedSystem.reconnect (called by tryReconnect) keeps the old cache in a local variable "cache". It only needs it to initialize "cacheXML" and "cacheServerCreation", so once those are initialized it would be easy to drop this reference. However, cacheServerCreation also holds references to the old cache through the "cache" instance variable on CacheServerCreation, so dropping the local variable alone would not be enough.
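
The general idea behind both fixes can be sketched as follows. This is only an illustration, not the actual Geode code: OldCache, ReconnectState, capture and reconnect are hypothetical names, and the XML string and port are placeholder values. The point is that the state needed for reconnect is copied into an object with no back-reference to the old cache, and the local reference is cleared before the new cache is built.

```java
public class ReconnectSketch {

    /** Hypothetical stand-in for the old cache; not Geode's real class. */
    static final class OldCache {
        // Pretend this array is the large region data we want collected early.
        final byte[] regionData = new byte[1024];

        String generateXml() {
            return "<cache/>"; // placeholder value
        }

        int serverPort() {
            return 40404; // placeholder value
        }
    }

    /**
     * Hypothetical snapshot of what reconnect needs. It holds plain values
     * only, so it does not keep the old cache reachable (unlike the "cache"
     * instance variable described above).
     */
    static final class ReconnectState {
        final String cacheXml;
        final int serverPort;

        ReconnectState(String cacheXml, int serverPort) {
            this.cacheXml = cacheXml;
            this.serverPort = serverPort;
        }
    }

    /** Copies the needed values out of the old cache. */
    static ReconnectState capture(OldCache cache) {
        return new ReconnectState(cache.generateXml(), cache.serverPort());
    }

    /** Builds the "new cache" (here just a description) from the snapshot alone. */
    static String reconnect(ReconnectState state) {
        return "new cache from " + state.cacheXml + " on port " + state.serverPort;
    }

    public static void main(String[] args) {
        OldCache cache = new OldCache();
        ReconnectState state = capture(cache);
        // Drop the last strong reference so the old cache becomes eligible
        // for garbage collection before the new cache is created.
        cache = null;
        System.out.println(reconnect(state));
    }
}
```

Whether the old cache is actually collected before the new one is built still depends on no other object (a caller's parameter, a field, a listener) holding a reference to it, which is why both places listed above would need to change.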