Description
-
- Why
https://github.com/apache/james-project/blob/master/src/adr/0014-blobstore-storage-policies.md
James exposes a simple BlobStore API for storing raw data. However, such raw data often varies in size and access patterns.
As an example:
- Mailbox message headers are expected to be small and frequently accessed
- Mailbox message bodies are expected to range in size from small to large but are infrequently accessed
- Mailbox attachment bodies are expected to range in size from small to large but are infrequently accessed
- DeletedMessageVault message headers are expected to be small and infrequently accessed
Caching frequently accessed data for which low latency is expected is worth it.
Caching infrequently accessed, non-latency-sensitive data like bodies and attachments is a waste of resources. We should stop doing it.
-
- Definition of done
Write an integration test demonstrating that small mail bodies do not end up in the blobStore cache.
-
- How
Upon reads (bytes or stream), the reader specifies which storage policy it wishes to use.
Upon LOW_COST reads, the CachedBlobStore queries only the back-end: it neither queries the cache nor populates it. Today's behavior (query the cache and populate it on a miss) is kept for HIGH_PERFORMANCE reads.
Use the HIGH_PERFORMANCE read level when reading headers, LOW_COST otherwise.
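
The read path described above could look roughly like the sketch below. It is only an illustration under stated assumptions: `SimpleBlobStore`, `InMemoryBackend`, `CachingBlobStoreSketch` and `isCached` are hypothetical names invented here, blob ids are plain strings, and the cache is an unbounded in-memory map. None of this is the actual James BlobStore or CachedBlobStore API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical names throughout: this is NOT the actual James BlobStore API.
enum StoragePolicy { LOW_COST, HIGH_PERFORMANCE }

// Assumed minimal read contract: the reader passes the policy it wishes to use.
interface SimpleBlobStore {
    byte[] readBytes(String blobId, StoragePolicy policy);
}

// Stand-in for the real back-end (object storage, Cassandra, ...).
class InMemoryBackend implements SimpleBlobStore {
    private final Map<String, byte[]> blobs = new ConcurrentHashMap<>();

    void save(String blobId, byte[] data) {
        blobs.put(blobId, data);
    }

    @Override
    public byte[] readBytes(String blobId, StoragePolicy policy) {
        byte[] data = blobs.get(blobId);
        if (data == null) {
            throw new IllegalArgumentException("Unknown blob: " + blobId);
        }
        return data;
    }
}

// Sketch of a caching decorator that honours the policy on reads.
class CachingBlobStoreSketch implements SimpleBlobStore {
    private final SimpleBlobStore backend;
    private final Map<String, byte[]> cache = new ConcurrentHashMap<>();

    CachingBlobStoreSketch(SimpleBlobStore backend) {
        this.backend = backend;
    }

    @Override
    public byte[] readBytes(String blobId, StoragePolicy policy) {
        if (policy == StoragePolicy.LOW_COST) {
            // LOW_COST: go straight to the back-end, never touch the cache.
            return backend.readBytes(blobId, policy);
        }
        // HIGH_PERFORMANCE: serve from the cache, populating it on a miss.
        return cache.computeIfAbsent(blobId, id -> backend.readBytes(id, policy));
    }

    // Test helper (hypothetical) to observe what ended up in the cache.
    boolean isCached(String blobId) {
        return cache.containsKey(blobId);
    }
}
```

With this shape, the check asked for in the definition of done reduces to: save a small body, read it with LOW_COST, and assert that the cache does not contain it, while a header read with HIGH_PERFORMANCE does end up cached.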