Following up from CASSANDRA-8897, there are further improvements that can be made to the BufferPool:
- The common code paths can be made non-atomic.
- The chunk pool can be turned into a stack instead of a queue, so that the most recently returned chunk is reused first and is more likely to still be in cache.
- The chunk pool can be made processor-local, using e.g. https://github.com/OpenHFT/Java-Thread-Affinity
- We can support smaller allocations by creating micro-chunks within each local pool, by allocating a single unit from the current chunk (or multiple units if we're about to discard a chunk that is not fully utilised).
- It should be possible to generalise this approach to make the entire allocation stack tiered, so that whenever you want a new chunk you go to the parent chunk that is an order of magnitude larger, and allocate a small slice of it (which you convert into a Chunk). Slices below a certain size can be owned exclusively, while slices above that size remain shared.
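
To make the stack, micro-chunk and tiering ideas above more concrete, here is a rough sketch. It is not the existing BufferPool code: TieredPoolSketch, Chunk, LocalPool, takeChunk and recycle are hypothetical names, and the sizes are arbitrary. It assumes each LocalPool is owned by a single thread (so plain fields suffice), that a chunk is only recycled once every buffer cut from it has been released, and it leaves out the synchronisation a genuinely shared parent tier would need.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;

final class TieredPoolSketch {
    /** Bump allocator over one region. Plain fields, no atomics: one owning thread only. */
    static final class Chunk {
        final ByteBuffer memory;
        final int unitSize;
        private int nextOffset = 0;

        Chunk(ByteBuffer memory, int unitSize) {
            this.memory = memory;
            this.unitSize = unitSize;
        }

        /** Returns a slice covering `units` units, or null if not enough space is left. */
        ByteBuffer allocate(int units) {
            int size = units * unitSize;
            if (nextOffset + size > memory.capacity()) return null;
            ByteBuffer dup = memory.duplicate();
            dup.limit(nextOffset + size);
            dup.position(nextOffset);
            nextOffset += size;
            return dup.slice();
        }

        int freeUnits() { return (memory.capacity() - nextOffset) / unitSize; }

        /** Assumption: the caller guarantees no slice of this chunk is still in use. */
        void reset() { nextOffset = 0; }
    }

    /** Local pool whose free chunks are reused LIFO (a stack rather than a queue). */
    static final class LocalPool {
        private final Chunk parent;   // the next tier up; each of its units becomes one chunk
        private final int unitSize;   // unit size used inside the chunks of this tier
        private final ArrayDeque<Chunk> free = new ArrayDeque<>();

        LocalPool(Chunk parent, int unitSize) {
            this.parent = parent;
            this.unitSize = unitSize;
        }

        /** Pops the most recently recycled chunk, else slices a new one from the parent. */
        Chunk takeChunk() {
            Chunk c = free.pollFirst();
            if (c != null) return c;
            ByteBuffer slice = parent.allocate(1);
            return slice == null ? null : new Chunk(slice, unitSize);
        }

        void recycle(Chunk c) {
            c.reset();
            free.addFirst(c);
        }
    }

    public static void main(String[] args) {
        // 1 MiB parent split into 64 KiB units; each unit becomes a chunk of 1 KiB units.
        Chunk parent = new Chunk(ByteBuffer.allocate(1 << 20), 64 * 1024);
        LocalPool pool = new LocalPool(parent, 1024);
        Chunk chunk = pool.takeChunk();
        // Micro-chunk: a single 1 KiB unit of the current chunk, subdivided into 64-byte units.
        Chunk micro = new Chunk(chunk.allocate(1), 64);
        ByteBuffer small = micro.allocate(2);
        System.out.println(small.capacity() + " bytes allocated, "
                           + micro.freeUnits() + " micro units left");
    }
}
```

The same Chunk type serves every tier here, which is the point of the generalisation: a micro-chunk is just a Chunk built over one unit of its parent. Where the cut-off between exclusively owned and shared slices should sit is left open, as in the description above.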