Background: Because Cassandra is an OLTP system based on pipelined execution, its worker pools can be saturated by longer-running calls. When the system is under stress, those long-running calls can make requests that should be short-lived take much longer to complete.
Introduce the concept of QoS (quality of service) into Cassandra for client queries. A few ideas:
1. Allow clients to specify a QoS to be sent to Cassandra from the driver as part of the protocol.
2. Allow different requests to be tagged based on some simple, perhaps configurable, criteria (e.g. whether ALLOW FILTERING was part of the query).
3. Base the QoS on the LUN that is accessed (e.g. SSDs get a higher QoS than a RAID 5 array).
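
To make ideas 1 and 2 more concrete, here is a minimal sketch of how a request could be assigned a QoS class and then dequeued in priority order. It is an illustration only, not Cassandra code: the `QoS` classes, the `classify` function, and the `QosQueue` class are all hypothetical names, and the rule that ALLOW FILTERING maps to the lowest class is an assumed example of a configured criterion. Idea 3 (LUN-based QoS) is not covered.

```python
import heapq
import itertools
from enum import IntEnum
from typing import Optional


class QoS(IntEnum):
    """Hypothetical QoS classes; a lower value is served first."""
    HIGH = 0
    NORMAL = 1
    LOW = 2


def classify(query: str, client_qos: Optional[QoS] = None) -> QoS:
    """Pick a QoS class for a request.

    Idea 1: a QoS sent by the client/driver wins if present.
    Idea 2: otherwise tag by a simple criterion (assumed rule:
    queries containing ALLOW FILTERING are demoted to LOW).
    """
    if client_qos is not None:
        return client_qos
    if "ALLOW FILTERING" in query.upper():
        return QoS.LOW
    return QoS.NORMAL


class QosQueue:
    """Strict-priority request queue, FIFO within each QoS class."""

    def __init__(self) -> None:
        self._heap = []
        # Monotonic counter keeps arrival order within a class and
        # ensures the query strings themselves are never compared.
        self._seq = itertools.count()

    def submit(self, query: str, client_qos: Optional[QoS] = None) -> QoS:
        qos = classify(query, client_qos)
        heapq.heappush(self._heap, (qos, next(self._seq), query))
        return qos

    def next_request(self) -> str:
        _qos, _seq, query = heapq.heappop(self._heap)
        return query

    def __len__(self) -> int:
        return len(self._heap)
```

With this sketch, a worker that calls `next_request()` picks up short, high-QoS requests ahead of an ALLOW FILTERING scan that was submitted earlier. Note that strict priority ordering like this can starve the lowest class under sustained load, so a real design would probably need aging or per-class capacity limits.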