Building on work done by tjake (CASSANDRA-10528), slebresne (CASSANDRA-5239), and others, convert read and write request paths to be fully non-blocking, to enable the eventual transition from SEDA to TPC (CASSANDRA-10989).
Eliminate the MUTATION, COUNTER_MUTATION, VIEW_MUTATION, READ, and READ_REPAIR stages, and move read and write execution directly to the Netty context.
For lack of decent async I/O options on Linux, we’ll still have to retain an extra thread pool for serving read requests for data not residing in our page cache (
Implementation-wise, we only have two options available to us: explicit FSMs and chained futures. Fibers would be the third, and easiest, option, but aren’t feasible in Java without resorting to direct bytecode manipulation (ourselves or using Quasar).
I have seen four implementations based on chained futures/promises now - three in Java and one in C++ - and I’m not convinced that it’s the optimal (or sane) choice for representing our complex logic - think 2i quorum read requests with timeouts at all levels, read repair (blocking and non-blocking), speculative retries in the mix, and SERIAL reads and writes.
I’m currently leaning towards an implementation based on explicit FSMs, and intend to provide a prototype - soonish - for comparison with CompletableFuture-like variants.
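To make the explicit-FSM option concrete, here is a minimal sketch of what a state machine for a single quorum write might look like. It is not Cassandra code: the class name QuorumWriteFsm, its states, and its handler methods are all hypothetical, and it assumes every handler is invoked from one event loop thread, so no locking is shown.

```java
/** Hypothetical explicit state machine for one quorum write; illustrative only. */
final class QuorumWriteFsm {
    enum State { AWAITING_ACKS, SUCCEEDED, FAILED, TIMED_OUT }

    private final int replicas;  // number of replicas the write was sent to
    private final int required;  // acks needed to satisfy the consistency level
    private int acks;
    private int failures;
    private State state = State.AWAITING_ACKS;

    QuorumWriteFsm(int replicas, int required) {
        if (required < 1 || required > replicas)
            throw new IllegalArgumentException("required must be in [1, replicas]");
        this.replicas = replicas;
        this.required = required;
    }

    State state() { return state; }

    // Each handler returns immediately; it only records the event and
    // moves to a terminal state once the outcome is decided.
    State onAck() {
        if (state != State.AWAITING_ACKS) return state; // late response, ignore
        if (++acks >= required) state = State.SUCCEEDED;
        return state;
    }

    State onFailure() {
        if (state != State.AWAITING_ACKS) return state;
        // Fail as soon as the remaining replicas can no longer reach the quorum.
        if (++failures > replicas - required) state = State.FAILED;
        return state;
    }

    State onTimeout() {
        if (state == State.AWAITING_ACKS) state = State.TIMED_OUT;
        return state;
    }
}
```

All transitions are visible in one place, which is the property that becomes harder to preserve once timeouts, retries, and read repair are expressed as chained callbacks. A real implementation would need many more states than this sketch.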
Either way, the transition is a relatively boring, straightforward refactoring.
There are, however, some extension points on both write and read paths that we do not control:
- authorisation implementations will have to be non-blocking. We have control over built-in ones, but for any custom implementation we will have to execute them in a separate thread pool
- 2i hooks on the write path will need to be non-blocking
- any trigger implementations will not be allowed to block
- UDFs and UDAs
We are further limited by API compatibility restrictions in the 3.x line, which forbid us from altering those extension points or adding any non-default interface methods to them, so these pose a problem.
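One way to handle custom, potentially blocking implementations of these extension points is to run them on a separate thread pool and hand a future back to the event loop. The sketch below illustrates that idea for authorisation only. BlockingAuthorizer and OffloadedAuthorizer are hypothetical names invented for this example, not Cassandra's actual interfaces; the only real API used is the JDK's CompletableFuture.supplyAsync(Supplier, Executor).

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;

/** Hypothetical stand-in for a third-party extension point that may block. */
interface BlockingAuthorizer {
    boolean isAuthorized(String user, String resource);
}

/** Hypothetical wrapper that keeps blocking calls off the event loop thread. */
final class OffloadedAuthorizer {
    private final BlockingAuthorizer delegate;
    private final ExecutorService pool; // dedicated pool for custom implementations

    OffloadedAuthorizer(BlockingAuthorizer delegate, ExecutorService pool) {
        this.delegate = delegate;
        this.pool = pool;
    }

    // Returns immediately; the blocking call runs on the dedicated pool and
    // the caller attaches its continuation to the returned future.
    CompletableFuture<Boolean> isAuthorizedAsync(String user, String resource) {
        return CompletableFuture.supplyAsync(
            () -> delegate.isAuthorized(user, resource), pool);
    }
}
```

Because the wrapper leaves the existing blocking interface untouched, it would not require changing the extension point itself, at the cost of a thread hop for every call into a custom implementation.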
Depending on logistics, I expect to get this done in time for the 3.4 or 3.6 feature release.