Because drivers abstract the handling of up/down nodes away from users, they have to deal with the fact that when a node is restarted (or joins), it won't know any prepared statements. Drivers could ignore that problem and wait for a query to return an error (that the statement is unknown to the node) before re-preparing the query on that node, but that is relatively inefficient: every time a node comes back up, you get latency spikes because some queries first fail, are then re-prepared, and only then executed. So instead, drivers (at least the Java driver, and I believe others do as well) proactively re-prepare statements when a node comes up. That solves the latency problem, but currently every driver instance blindly re-prepares all statements, meaning that in a large cluster with many clients there is a lot of duplicated work (it would be enough for a single client to prepare the statements) and a bigger than necessary load on the node that just started.
One idea to solve this is to give clients a (cheap) way to check whether some statements are prepared on a node. There are different options for providing that, but what I'd suggest is to add a system table exposing the (cached) prepared statements, because:
- it's reasonably straightforward to implement: we just add a row to the table when a statement is prepared and remove it when the statement is evicted (we already have eviction listeners). We'd also truncate the table on startup, but that's easy enough. We can even switch it to a "virtual table" if/when CASSANDRA-7622 lands, but it's trivial to do with a normal table in the meantime.
- it doesn't require a change to the protocol or anything of the sort. It could even be done in 2.1 if we wished to.
- exposing prepared statements feels like genuinely useful information to have (outside of the problem described here, that is), if only for debugging/educational purposes.
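To make the intended client behavior concrete, here is a hedged sketch of the check-then-prepare logic a driver could apply when a node comes up. It is plain Java with hypothetical names (no real driver API is used); in practice `fetchPreparedIds` would query the proposed system table on the node that just came up:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ReprepareOnUp {
    // Hypothetical stand-in: in a real driver this would SELECT from the
    // proposed system table on the node to learn which statements it
    // already has cached.
    static Set<String> fetchPreparedIds(Set<String> nodeCache) {
        return new HashSet<>(nodeCache);
    }

    // Re-prepare only the statements missing from the node's cache,
    // instead of blindly re-preparing everything the client knows about.
    static List<String> statementsToReprepare(List<String> clientStatements,
                                              Set<String> nodeCache) {
        Set<String> alreadyPrepared = fetchPreparedIds(nodeCache);
        List<String> missing = new ArrayList<>();
        for (String stmt : clientStatements) {
            if (!alreadyPrepared.contains(stmt)) {
                missing.add(stmt);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        List<String> client = List.of("SELECT ...", "INSERT ...", "DELETE ...");
        Set<String> node = Set.of("SELECT ...", "DELETE ...");
        // Only the statement the node doesn't know gets re-prepared.
        System.out.println(statementsToReprepare(client, node)); // [INSERT ...]
    }
}
```

With many clients, each one still re-prepares only what is missing, so a node that was already warmed up by one client costs the others a single cheap read instead of a full round of prepares.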
The exposed table could look something like:
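As a hedged sketch only (column names and types here are illustrative suggestions, not a final schema):

```sql
-- Illustrative sketch, not a final schema.
CREATE TABLE system.prepared_statements (
    prepared_id blob,        -- statement id returned to the client at prepare time
    logged_keyspace text,    -- keyspace the statement was prepared against, if any
    query_string text,       -- CQL text of the prepared statement
    PRIMARY KEY (prepared_id)
);
```

Keying on the prepared statement id would let a client check for exactly the statements it holds, while `query_string` keeps the table readable for the debugging/educational use mentioned above.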