We need to either (a) have a separate deserialization queue for "reply" traffic (we could use one of the "header" bits that isn't part of the Message proper to control this), (b) drop messages for overloaded stages on the floor so the deserializer doesn't overload, or (c) give up the command/reply division entirely.
Alternatively, option (b) reminds me that instead of "backpressure" we could just do "timeoutpressure": instead of overloaded stages backpressuring the message deserializer, which in turn backpressures socket reads, the deserializer can just discard messages the system is too busy to handle. The downside is that it will take an extra rpc_timeout of latency before clients start to get timeouts. The upside is that as things unclog, the messages that get processed will be fresh ones, so we are less likely to waste work processing messages that the client isn't even waiting for anymore.
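To make the "timeoutpressure" idea concrete, here is a rough sketch of what the deserializer-side handoff might look like. The class name DroppingStage and its methods are made up for illustration and are not existing code; the only real API used is the non-blocking offer() on a bounded java.util.concurrent queue. Treat it as a sketch assuming one bounded queue per stage.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch: the deserializer discards work instead of blocking when a stage is full. */
public class DroppingStage {
    private final BlockingQueue<Runnable> queue;
    private final AtomicLong dropped = new AtomicLong();

    public DroppingStage(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    /**
     * Called by the deserializer. Never blocks: if the stage queue is full the
     * task is discarded and the client eventually sees an rpc timeout.
     * Returns true if the task was queued, false if it was dropped.
     */
    public boolean submit(Runnable task) {
        if (queue.offer(task)) {
            return true;
        }
        dropped.incrementAndGet();
        return false;
    }

    /** Called by stage workers; returns null when there is nothing queued. */
    public Runnable poll() {
        return queue.poll();
    }

    /** Number of tasks discarded so far, e.g. for exposing as a metric. */
    public long droppedCount() {
        return dropped.get();
    }
}
```

Because submit() never blocks, the deserializer keeps draining the socket even when a stage is saturated, which is the point of dropping rather than backpressuring. Counting the drops is an addition of mine, but some visibility into how much is being discarded seems useful.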
Also, I'd like to dynamically adjust stage capacity based on the amount of work that gets processed, rather than have a fixed value that has to be manually tuned. Not sure what that would look like, since none of the Java BlockingQueue classes have adjustable capacity post-construction. But stage enqueueing is only done in one place (by the deserializer executor), so we can one-off something if we have to.
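One possible shape for the one-off, as a sketch only: wrap an unbounded LinkedBlockingQueue and enforce a soft, adjustable limit at the single enqueue site. AdjustableStageQueue and its method names are hypothetical. The check-then-offer in tryEnqueue() is only safe under the assumption stated above, that there is exactly one enqueueing thread; with several producers the limit could be overshot. How the capacity should be recomputed from observed throughput is left open here.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical sketch: a stage queue whose capacity can be changed after construction. */
public class AdjustableStageQueue<T> {
    // Unbounded underneath; the limit is enforced by the single producer below.
    private final BlockingQueue<T> queue = new LinkedBlockingQueue<>();
    private volatile int capacity;

    public AdjustableStageQueue(int initialCapacity) {
        setCapacity(initialCapacity);
    }

    /**
     * Assumes a single enqueueing thread. Consumers can only shrink the queue
     * between the size check and the offer, so the limit is never exceeded.
     */
    public boolean tryEnqueue(T item) {
        if (queue.size() >= capacity) {
            return false;
        }
        return queue.offer(item);
    }

    /** Blocks until an item is available; for stage worker threads. */
    public T take() throws InterruptedException {
        return queue.take();
    }

    /** Non-blocking variant; returns null when empty. */
    public T poll() {
        return queue.poll();
    }

    /** Shrinking does not evict queued items; it only rejects new ones until the queue drains. */
    public void setCapacity(int newCapacity) {
        if (newCapacity < 1) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = newCapacity;
    }

    public int capacity() {
        return capacity;
    }

    public int size() {
        return queue.size();
    }
}
```

Some periodic task could then call setCapacity() based on however many items the stage completed in the last interval; that policy is the part that still needs thought.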