On one of my production instances, I noticed that some copy operations are slow. Moving 60 messages takes around 2 seconds (~33 ms per message).
More interestingly, a total of 1042 Cassandra queries are generated (~17 per message)!
The move is currently performed on a per-message basis, sequentially.
However, by grouping updates together we can:
- Allocate a single MODSEQ thus saving on ModSeq generation
- Allocate several UIDs at once by asking for a UID range
- As we are no longer performing id generation for each message, we can parallelize the message insertion...
- And the index table updates (applicable flags, mailbox counters) can be grouped instead of being performed for each message. Other index table updates can be parallelized further, yielding additional gains.
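The grouping described above can be sketched as follows. This is only an illustration, not James code: FakeMailboxStore, its method names, and the four simulated queries per message are hypothetical stand-ins chosen to keep the example small (the real path issues ~17 queries per message). It contrasts a per-message move with a batched move that reserves one UID range, allocates one ModSeq, inserts messages concurrently, and updates the counter once.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class FakeMailboxStore:
    """Hypothetical in-memory stand-in for the Cassandra-backed store.

    Every method bumps `queries` once to simulate one Cassandra query.
    """
    next_uid: int = 1
    mod_seq: int = 0
    queries: int = 0
    message_count: int = 0
    rows: dict = field(default_factory=dict)  # uid -> (message_id, mod_seq)

    async def allocate_uids(self, count: int) -> range:
        # One query reserves a contiguous UID range, whatever its size.
        self.queries += 1
        first = self.next_uid
        self.next_uid += count
        return range(first, first + count)

    async def next_mod_seq(self) -> int:
        self.queries += 1
        self.mod_seq += 1
        return self.mod_seq

    async def insert_message(self, uid: int, message_id: str, mod_seq: int) -> None:
        self.queries += 1
        await asyncio.sleep(0)  # yield control, as a real driver call would
        self.rows[uid] = (message_id, mod_seq)

    async def increment_counter(self, delta: int) -> None:
        self.queries += 1
        self.message_count += delta


async def move_one_by_one(store: FakeMailboxStore, message_ids: list) -> None:
    # Per-message path: UID, ModSeq, insert and counter update for every message.
    for message_id in message_ids:
        (uid,) = await store.allocate_uids(1)
        mod_seq = await store.next_mod_seq()
        await store.insert_message(uid, message_id, mod_seq)
        await store.increment_counter(1)


async def move_batched(store: FakeMailboxStore, message_ids: list) -> None:
    # Grouped path: one UID range, one ModSeq, concurrent inserts, one counter update.
    uids = await store.allocate_uids(len(message_ids))
    mod_seq = await store.next_mod_seq()
    await asyncio.gather(
        *(store.insert_message(uid, message_id, mod_seq)
          for uid, message_id in zip(uids, message_ids))
    )
    await store.increment_counter(len(message_ids))


def run_demo(n: int = 60):
    message_ids = [f"msg-{i}" for i in range(n)]
    sequential, batched = FakeMailboxStore(), FakeMailboxStore()
    asyncio.run(move_one_by_one(sequential, message_ids))
    asyncio.run(move_batched(batched, message_ids))
    return sequential, batched


if __name__ == "__main__":
    sequential, batched = run_demo(60)
    print(sequential.queries, batched.queries)  # prints "240 63"
```

In this toy model the per-message path costs 4 queries per message, while the batched path costs a fixed 3 queries plus one insert per message. The actual gain in James depends on which of the ~17 real queries per message can be grouped.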
In brief, according to the attached Glowroot capture, we can expect a 75% performance improvement through:
- Cassandra query volume reduction
- Operation parallelization
We also expect the above enhancements to have a positive impact on overall Cassandra performance.
ASYNC
Transaction type: Web
Transaction name: /jmap
Start: 2020-12-22 3:40:57.645 pm (+07:00)
Duration: 2,085.1 milliseconds

Breakdown (Main Thread):
                     total (ms)    count
http request               0.46        1

Breakdown (Auxiliary Threads):
                     total (ms)    count
auxiliary thread        2,744.8    6,959
jmapMethod              1,936.4        1
cassandra query            50.2    1,042

Breakdown (Async Timers):
                     total (ms)    count
cassandra query         3,907.5    1,042

JVM Thread Stats (Main Thread)
CPU time: 0.42 milliseconds
Blocked time: 0.0 milliseconds
Waited time: 0.0 milliseconds
Allocated memory: 18.5 KB

JVM Thread Stats (Auxiliary Threads)
CPU time: 489.6 milliseconds
Blocked time: 0.0 milliseconds
Waited time: 1,924.0 milliseconds
Allocated memory: 17.6 MB