We have previously documented that the SMTP stack is slow and generates a lot of GC pressure (https://github.com/apache/james-project/pull/309#issuecomment-786358253).
Under the hood, the distributed server relies on in-memory data structures (MailStore, for both enqueuing and dequeuing emails), and inefficient APIs (e.g. InputStream) prevent replaying the content (even though we can!), which forces defensive copies.
Based on this diagnosis, I started experimenting in https://github.com/apache/james-project/pull/309, applying the following principles to the SMTP write path:
- Avoid holding data structures in memory
- Avoid copies by allowing streams to be regenerated on demand
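To illustrate the second principle, here is a minimal sketch of what "replayable" content could look like, assuming the data already sits on disk. The names ReplayableContent, FileContent and ReplayDemo are hypothetical and are not taken from the James code base or from the PR; the in-memory destination in main is only there to keep the demo short.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical abstraction: content that can be opened as many times as needed,
// unlike a bare InputStream, which can only be consumed once.
interface ReplayableContent {
    InputStream openStream() throws IOException;

    long size() throws IOException;
}

// File-backed implementation: every call opens a fresh stream on the same file,
// so no defensive copy of the bytes is required.
final class FileContent implements ReplayableContent {
    private final Path path;

    FileContent(Path path) {
        this.path = path;
    }

    @Override
    public InputStream openStream() throws IOException {
        return Files.newInputStream(path);
    }

    @Override
    public long size() throws IOException {
        return Files.size(path);
    }
}

final class ReplayDemo {
    // A consumer streams the content to its destination without first loading
    // the whole message into a byte array. Returns the number of bytes copied.
    static long store(ReplayableContent content, OutputStream destination) throws IOException {
        try (InputStream in = content.openStream()) {
            return in.transferTo(destination);
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("mail", ".eml");
        try {
            Files.write(tmp, "Subject: hi\r\n\r\nbody\r\n".getBytes(StandardCharsets.US_ASCII));
            ReplayableContent content = new FileContent(tmp);
            // Two independent consumers read the same content; neither needs a copy.
            ByteArrayOutputStream first = new ByteArrayOutputStream();
            ByteArrayOutputStream second = new ByteArrayOutputStream();
            store(content, first);
            store(content, second);
            System.out.println(first.size() == second.size());
        } finally {
            Files.delete(tmp);
        }
    }
}
```

With a one-shot InputStream, the second consumer would need a buffered copy of the bytes; with a replayable source, each consumer simply asks for a new stream.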
My measurements show a 3x improvement in SMTP throughput. Since all databases, including a Zenko S3 server, were hosted on the same server in my test infrastructure, I expect the gains to be even higher in real deployments.
This work on efficiency should largely outweigh the performance impacts of
I would like this work to make it into the upcoming 3.6.0 release.
Among the upcoming topics on my mind that might lead to related work: the APPEND command's inMemorySize limit is buggy (exceeding the size limit causes the APPEND to crash). As a temporary remediation we enforced a higher memory limit, which defeats the principles above. I would prefer to see a FileBackedOutputStream with a replayable byte source used there, achieving similar enhancements for the APPEND command.
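For readers unfamiliar with the idea, here is a rough, JDK-only sketch of a file-backed output stream: it buffers in memory up to a threshold, spills to a temporary file beyond it, and can be re-read any number of times. SpillingOutputStream and its methods are hypothetical names used for illustration; this is not the FileBackedOutputStream class mentioned above, nor code from James.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch: small payloads stay in memory, larger ones go to disk,
// so exceeding the threshold changes the backing store instead of failing.
final class SpillingOutputStream extends OutputStream {
    private final int threshold;
    private ByteArrayOutputStream memory = new ByteArrayOutputStream();
    private Path file;
    private OutputStream fileOut;
    private boolean closed;

    SpillingOutputStream(int threshold) {
        this.threshold = threshold;
    }

    @Override
    public void write(int b) throws IOException {
        write(new byte[] {(byte) b}, 0, 1);
    }

    @Override
    public void write(byte[] buf, int off, int len) throws IOException {
        // Switch to the file before the in-memory buffer would exceed the threshold.
        if (fileOut == null && memory.size() + len > threshold) {
            spillToFile();
        }
        if (fileOut != null) {
            fileOut.write(buf, off, len);
        } else {
            memory.write(buf, off, len);
        }
    }

    private void spillToFile() throws IOException {
        file = Files.createTempFile("append", ".tmp");
        fileOut = new BufferedOutputStream(Files.newOutputStream(file));
        memory.writeTo(fileOut); // move what was buffered so far
        memory = null;           // release the heap buffer
    }

    @Override
    public void flush() throws IOException {
        if (!closed && fileOut != null) {
            fileOut.flush();
        }
    }

    @Override
    public void close() throws IOException {
        if (!closed) {
            closed = true;
            if (fileOut != null) {
                fileOut.close();
            }
        }
    }

    boolean isFileBacked() {
        return file != null;
    }

    // Replayable source: each call returns a fresh stream over everything written so far.
    InputStream openStream() throws IOException {
        flush();
        return file != null
            ? Files.newInputStream(file)
            : new ByteArrayInputStream(memory.toByteArray());
    }

    // Closes the stream and removes the temporary file, if one was created.
    void dispose() throws IOException {
        close();
        if (file != null) {
            Files.deleteIfExists(file);
        }
    }
}
```

A caller would write the APPEND literal into such a stream, hand openStream() to whatever needs to read it (possibly several times), and call dispose() once done. The threshold then only decides where the bytes live, not whether the command succeeds.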