assuming that the sync period is sane (e.g. ~100ms)
The sync period defaults to 10s, and to my knowledge that is what most users run with, so in general we will only be compressing one segment at a time. That is still sane, since the cluster has redundancy, although a sync period of 100ms to 500ms might be more suitable for high-traffic nodes. Either way it's probably not a big deal, since we only care about compression throughput under saturation, which should mean many segments. I only mention it because it is an easy extension, and it also means the sync thread may find already-compressed data waiting for it when it runs, reducing the latency until sync completion.
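For reference, the knob in question is the periodic sync interval in cassandra.yaml; a high-traffic node tuned as suggested above might look something like this:

```yaml
# cassandra.yaml: sync the commit log periodically rather than per-write
commitlog_sync: periodic
# default is 10000 (10s); 100-500ms may suit high-traffic nodes
commitlog_sync_period_in_ms: 250
```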
Let me try to rephrase what you are saying to make sure I understand it correctly:
- the single sync thread forms sections at regular time intervals and sends them to a compression executor/phase (SPMC queue)
- the sync thread then waits on the futures and syncs each in order (see the sketch after these lists)
Or, with the extension:
- mutators periodically submit segments to the compressor
- once the compressor completes an entire segment, requestExtraSync() is called (instead of in advanceAllocatingFrom())
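For concreteness, here is a minimal sketch of the first shape, with a plain ExecutorService standing in for the SPMC queue. All the names here (PeriodicSyncThread, sectionsSinceLastSync, writeAndSync) are hypothetical, not the actual commit log classes:

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.LockSupport;

class PeriodicSyncThread implements Runnable
{
    private final long syncPeriodNanos;
    // The pool's internal work queue is the SPMC handoff:
    // one producer (this thread), many compression workers.
    private final ExecutorService compressors = Executors.newFixedThreadPool(2);
    private volatile long lastSyncedAt;

    PeriodicSyncThread(long syncPeriodMillis)
    {
        this.syncPeriodNanos = TimeUnit.MILLISECONDS.toNanos(syncPeriodMillis);
    }

    @Override
    public void run()
    {
        while (!Thread.currentThread().isInterrupted())
        {
            long syncStartedAt = System.currentTimeMillis();

            // 1. Form sections from data written since the last sync,
            //    and submit each to the compression pool.
            List<Future<ByteBuffer>> compressed = new ArrayList<>();
            for (ByteBuffer section : sectionsSinceLastSync())
                compressed.add(compressors.submit(() -> compress(section)));

            // 2. Wait on the futures *in order* and sync each to disk,
            //    so on-disk ordering matches the log's logical ordering.
            try
            {
                for (Future<ByteBuffer> f : compressed)
                    writeAndSync(f.get());
            }
            catch (InterruptedException e)
            {
                return;
            }
            catch (ExecutionException e)
            {
                throw new RuntimeException(e);
            }

            // Only now is everything up to syncStartedAt durable.
            lastSyncedAt = syncStartedAt;

            LockSupport.parkNanos(syncPeriodNanos);
        }
    }

    // Stubs standing in for the real segment/allocation machinery.
    private List<ByteBuffer> sectionsSinceLastSync() { return Collections.emptyList(); }
    private ByteBuffer compress(ByteBuffer section) { return section; }
    private void writeAndSync(ByteBuffer buf) { /* write, then fsync */ }
}
```

Note that the ordering guarantee falls out of iterating the futures in submission order, regardless of which worker finishes first.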
Why is this simpler, or of comparable complexity?
The explanation has two steps instead of five. More importantly, there is no interleaving of events to reason about between multiple sync threads, and lastSync is accurate (which matters, since an inaccurate value could artificially pause writes). This also makes future improvements here easier and safer to deliver, because we don't have to reason about how they interact with each other. In particular, rolling lastSync forward after each segment is synced is a natural improvement (to ensure write latencies don't spike under load) but is challenging to introduce with multiple sync threads. Since we don't expect this feature to be used widely (we expect multiple CL disks to be used instead, if you're bottlenecking), the simpler approach seems more sensible to me.
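To illustrate why that improvement is easy with a single sync thread, extending the hypothetical sketch above it becomes a one-line change inside the existing loop:

```java
// Inside the single sync thread's loop from the sketch above: because
// this one thread alone observes completion order, it can safely roll
// lastSyncedAt forward after each segment rather than once per sync
// period (in the real code this would be the timestamp covered by that
// segment's data, not wall-clock "now").
for (Future<ByteBuffer> f : compressed)
{
    writeAndSync(f.get());
    lastSyncedAt = System.currentTimeMillis();
}
```

Writers blocked on commit log backpressure then see progress per segment instead of per interval, so latency doesn't spike while a long sync pass drains under load.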
Wouldn't the two extra queues waste resources and increase latency?
We add zero extra queues in the typical case, and one in the uncommon use case. If we introduce enough threads that compression is faster than the disk, the synchronization costs will be near zero; if that is not the case and we are still bottlenecking on compression, then we aren't losing much either (a few microseconds every few hundred milliseconds, at 250MB/s compression speed), so it doesn't seem likely to be significant.
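To put rough numbers on that (assuming the default 32MB segment size): compressing a segment at 250MB/s takes ~128ms, while a queue handoff costs on the order of a few microseconds, i.e. roughly 0.004% overhead.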
We're then no longer honouring the sync interval: we're syncing more frequently, which may reduce disk throughput. The relative timing of the syncs may also vary, likely falling into lock-step under saturation, so there may be short periods of many competing syncs, potentially yielding pathological disk behaviour and introducing contention on the synchronized blocks inside the segments, in effect creating an MPMC queue and eliminating those few micros of benefit.
(FTR, the MPMC/SPMC/MPSC distinctions are likely not important here. The only real concern is thread signalling, and that is the wrong order of magnitude to matter when we're bottlenecked on disk or on compression of large chunks.)