Estimates how parallel batches count affects load time of same amount of persistent data. The lesser score (average time load), the better. Results in short: looks like 1 batch per thread is enough. But we can keep x2 for network issues. servers - server nodes maxDsOps - streamer's batches per node sendMsgDelay - simulated network delay or write responses. ================================================================================================================================================================== Benchmark (cacheWriteMode) (maxDsOps) (sendMsgDelay) (servers) (walMode) Mode Cnt Score Error Units SERVERS: 2 JmhPersistentStreamerReceiverBenchmark.benchDefaultIsolated PRIMARY_SYNC 4 3 2 LOG_ONLY avgt 5 11,825 ± 13,959 s/op >>> [Load time is same since `CPUs * 1` / 16 batches] JmhPersistentStreamerReceiverBenchmark.benchDefaultIsolated PRIMARY_SYNC 8 3 2 LOG_ONLY avgt 5 9,138 ± 5,238 s/op JmhPersistentStreamerReceiverBenchmark.benchDefaultIsolated PRIMARY_SYNC 16 3 2 LOG_ONLY avgt 5 7,334 ± 5,799 s/op JmhPersistentStreamerReceiverBenchmark.benchDefaultIsolated PRIMARY_SYNC 32 3 2 LOG_ONLY avgt 5 7,247 ± 5,518 s/op JmhPersistentStreamerReceiverBenchmark.benchDefaultIsolated PRIMARY_SYNC 64 3 2 LOG_ONLY avgt 5 8,529 ± 1,681 s/op JmhPersistentStreamerReceiverBenchmark.benchDefaultIsolated PRIMARY_SYNC 128 3 2 LOG_ONLY avgt 5 9,001 ± 2,424 s/op === FULL_SYNC: JmhPersistentStreamerReceiverBenchmark.benchDefaultIsolated FULL_SYNC 4 3 2 LOG_ONLY avgt 5 13,336 ± 4,264 s/op JmhPersistentStreamerReceiverBenchmark.benchDefaultIsolated FULL_SYNC 8 3 2 LOG_ONLY avgt 5 10,416 ± 3,984 s/op >>> [Load time is same `CPUs * 1`] JmhPersistentStreamerReceiverBenchmark.benchDefaultIsolated FULL_SYNC 16 3 2 LOG_ONLY avgt 5 8,394 ± 5,186 s/op JmhPersistentStreamerReceiverBenchmark.benchDefaultIsolated FULL_SYNC 32 3 2 LOG_ONLY avgt 5 9,273 ± 8,043 s/op JmhPersistentStreamerReceiverBenchmark.benchDefaultIsolated FULL_SYNC 64 3 2 LOG_ONLY avgt 5 8,088 ± 6,422 s/op JmhPersistentStreamerReceiverBenchmark.benchDefaultIsolated FULL_SYNC 128 3 2 LOG_ONLY avgt 5 8,809 ± 5,519 s/op SERVERS: 3 JmhPersistentStreamerReceiverBenchmark.benchDefaultIsolated PRIMARY_SYNC 4 3 3 LOG_ONLY avgt 5 14,594 ± 10,666 s/op JmhPersistentStreamerReceiverBenchmark.benchDefaultIsolated PRIMARY_SYNC 8 3 3 LOG_ONLY avgt 5 15,176 ± 7,848 s/op >>> [Load time is same `CPUs * 1`] JmhPersistentStreamerReceiverBenchmark.benchDefaultIsolated PRIMARY_SYNC 16 3 3 LOG_ONLY avgt 5 9,578 ± 4,345 s/op JmhPersistentStreamerReceiverBenchmark.benchDefaultIsolated PRIMARY_SYNC 32 3 3 LOG_ONLY avgt 5 10,280 ± 4,121 s/op JmhPersistentStreamerReceiverBenchmark.benchDefaultIsolated PRIMARY_SYNC 64 3 3 LOG_ONLY avgt 5 8,976 ± 3,753 s/op JmhPersistentStreamerReceiverBenchmark.benchDefaultIsolated PRIMARY_SYNC 128 3 3 LOG_ONLY avgt 5 9,295 ± 4,269 s/op === FULL_SYNC: JmhPersistentStreamerReceiverBenchmark.benchDefaultIsolated FULL_SYNC 4 3 3 LOG_ONLY avgt 5 21,961 ± 12,751 s/op JmhPersistentStreamerReceiverBenchmark.benchDefaultIsolated FULL_SYNC 8 3 3 LOG_ONLY avgt 5 16,261 ± 12,808 s/op >>> [Load time is same `CPUs * 1`] JmhPersistentStreamerReceiverBenchmark.benchDefaultIsolated FULL_SYNC 16 3 3 LOG_ONLY avgt 5 10,426 ± 9,156 s/op JmhPersistentStreamerReceiverBenchmark.benchDefaultIsolated FULL_SYNC 32 3 3 LOG_ONLY avgt 5 10,237 ± 2,163 s/op JmhPersistentStreamerReceiverBenchmark.benchDefaultIsolated FULL_SYNC 64 3 3 LOG_ONLY avgt 5 10,551 ± 3,310 s/op JmhPersistentStreamerReceiverBenchmark.benchDefaultIsolated FULL_SYNC 128 3 3 LOG_ONLY avgt 5 9,811 ± 2,689 s/op ================================================================================================================================================================== OS: Linux void 5.18.0-4-amd64 #1 SMP PREEMPT_DYNAMIC Debian 5.18.16-1 (2022-08-10) x86_64 GNU/Linux JMH 1.13 VM version: JDK 11.0.2, VM 11.0.2+9 VM options: -Xms2g -Xmx2g -server -XX:+AlwaysPreTouch CPU: vendor_id : AuthenticAMD cpu family : 25 model : 80 model name : AMD Ryzen 7 5800U with Radeon Graphics cpu MHz : 1600.000 cache size : 512 KB siblings : 16 cpu cores : 8 cpuid level : 16 Disk: `dd if=/dev/zero of=/tmp/test1.img bs=1G count=1 oflag=dsync`: 1073741824 bytes (1,1 GB, 1,0 GiB) copied, 1,25661 s, 854 MB/s `dd if=/dev/zero of=/tmp/test2.img bs=512 count=1000 oflag=dsync`: 512000 bytes (512 kB, 500 KiB) copied, 2,80265 s, 183 kB/s