Okay, take four here. The problem with v3 is that we were only blocking sendOneWay during shutdown, not addCallback, which is the source of the ExpiringMap entries we were waiting for.
As I wrote in the code comment:
* There isn't a good way to shut down the MessagingService. One problem (but not the only one)
* is that StorageProxy has no way to communicate back to clients, "I'm nominally alive, but I can't
* send that request to the nodes with your data." Neither TimedOut nor Unavailable is appropriate
* to return in that situation.
* So instead of shutting down MS and letting StorageProxy/clients cope somehow, we shut down
* the Thrift service and then wait for all the outstanding requests to finish or time out.
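To illustrate the idea in that comment, here is a minimal sketch of "stop taking new callbacks, then wait for the outstanding ones to finish or expire." It is not the patch itself: the class `DrainOnShutdown` and its methods are hypothetical names, a plain `ConcurrentHashMap` stands in for ExpiringMap, and refusing a late registration (rather than parking the caller) is an assumption made to keep the example short.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the callback bookkeeping; not Cassandra's API.
final class DrainOnShutdown {
    // request id -> absolute deadline in nanoseconds
    private final ConcurrentHashMap<String, Long> pending = new ConcurrentHashMap<>();
    private volatile boolean shuttingDown = false;

    // Track a request that expects a reply. Refused once shutdown has begun.
    // Assumes the client-facing service is already stopped, so a registration
    // racing with shutdown is not a concern in this sketch.
    boolean register(String id, long timeoutMillis) {
        if (shuttingDown) {
            return false;
        }
        pending.put(id, System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis));
        return true;
    }

    // A reply arrived; the request is no longer outstanding.
    void complete(String id) {
        pending.remove(id);
    }

    int outstanding() {
        return pending.size();
    }

    // Stop accepting registrations, then wait until every outstanding request
    // has either completed or passed its deadline. Returns false if interrupted.
    boolean shutdownAndDrain() {
        shuttingDown = true;
        while (!pending.isEmpty()) {
            long now = System.nanoTime();
            // Drop entries whose deadline has passed (overflow-safe comparison).
            pending.values().removeIf(deadline -> deadline - now <= 0);
            if (pending.isEmpty()) {
                break;
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }
}
```

The point of the sketch is that the wait is bounded by the longest request timeout, so shutdown cannot hang as long as nothing new is registered after it starts.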
The waiting part was straightforward. I also had to make the Thrift shutdown actually work: we were calling setSoTimeout to try to make accept() non-blocking, but a timeout of "0" means "wait indefinitely," not "don't wait at all." Then we needed to handle the timeout in the accept loop.
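For reference, a rough sketch of the accept-loop pattern using plain `java.net.ServerSocket`. The class name `StoppableAcceptLoop`, the stop flag, and the 50 ms poll interval are illustrative assumptions, not the actual Thrift server code. `setSoTimeout` with a positive value makes `accept()` throw `SocketTimeoutException` when nothing connects in time, which gives the loop a chance to notice the stop request.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical accept loop; names and the poll interval are assumptions.
final class StoppableAcceptLoop implements Runnable {
    private final ServerSocket serverSocket;
    private final AtomicBoolean stopped = new AtomicBoolean(false);

    StoppableAcceptLoop(ServerSocket serverSocket, int pollMillis) throws IOException {
        this.serverSocket = serverSocket;
        // Must be positive: setSoTimeout(0) means "block indefinitely".
        serverSocket.setSoTimeout(pollMillis);
    }

    void stop() {
        stopped.set(true);
    }

    @Override
    public void run() {
        while (!stopped.get()) {
            try (Socket client = serverSocket.accept()) {
                // A real server would hand the socket to a worker here;
                // this sketch just closes it.
            } catch (SocketTimeoutException e) {
                // Nobody connected within the poll interval; re-check the flag.
            } catch (IOException e) {
                break; // socket closed or failed; leave the loop
            }
        }
    }

    // Small self-check: the loop thread should exit soon after stop().
    static boolean demo() {
        try (ServerSocket ss = new ServerSocket(0)) {
            StoppableAcceptLoop loop = new StoppableAcceptLoop(ss, 50);
            Thread t = new Thread(loop, "accept-loop");
            t.start();
            Thread.sleep(200);
            loop.stop();
            t.join(2000);
            return !t.isAlive();
        } catch (IOException | InterruptedException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("accept loop stopped: " + demo());
    }
}
```

With a timeout of 0 the same loop would sit in `accept()` until a client happened to connect, so the stop flag would never be re-read.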
Finally, I did a bunch of cleanup to ExpiringMap and added trace-level logging in case we need to go at this again.