Details
- Improvement
- Status: Open
- Normal
- Resolution: Unresolved
- None
- None
- Performance
- Normal
- All
- None
Description
On read-heavy workloads, Cassandra performs much better with a low read-ahead setting. In my tests I've seen a 5x improvement in throughput and more than a 50% reduction in latency. However, I've also observed that a low read-ahead setting can hurt compaction and streaming throughput. The impact is worst in cloud environments, where every small read still consumes an IOP, so many tiny requests incur high IOPS costs.
- We should investigate using POSIX_FADV_DONTNEED on files we're compacting to see if we can improve performance and reduce page faults.
- This should be combined with an internal read-ahead style buffer that Cassandra manages, similar to a BufferedInputStream but with our own machinery. This buffer should read fairly large blocks of data off disk at a time. EBS, for example, allows 1 IOP to be up to 256KB. A considerable amount of time is spent in blocking I/O during compaction and streaming. Reducing how often we read from disk should speed up all sequential I/O operations.
- We can reduce system calls by buffering writes as well, but I think it will have less of an impact than buffering reads.
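To make the buffering idea in the list above concrete, here is a minimal Java sketch of a reader that pulls large blocks from a file and serves small reads out of memory. It is only an illustration under stated assumptions: `LargeBlockReader`, its `diskReads` counter, and the 256KB default block size (borrowed from the EBS figure above) are hypothetical names and choices, not existing Cassandra classes or settings. It does not attempt the POSIX_FADV_DONTNEED part, since the JDK standard library does not expose `posix_fadvise` and that would need native bindings.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical sketch of a Cassandra-managed read-ahead buffer; not a real Cassandra class. */
final class LargeBlockReader implements AutoCloseable {
    // Assumed default: one block per 256KB, matching the EBS per-IOP limit quoted above.
    static final int BLOCK_SIZE = 256 * 1024;

    private final FileChannel channel;
    private final ByteBuffer block;
    private long diskReads = 0; // read() calls issued against the channel

    LargeBlockReader(Path path, int blockSize) throws IOException {
        this.channel = FileChannel.open(path, StandardOpenOption.READ);
        this.block = ByteBuffer.allocate(blockSize);
        this.block.limit(0); // start empty so the first read triggers a refill
    }

    /** Copies up to len bytes into dst; returns the count copied, or -1 at end of file. */
    int read(byte[] dst, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        if (!block.hasRemaining() && !refill()) {
            return -1;
        }
        int n = Math.min(len, block.remaining());
        block.get(dst, off, n); // served from memory, no system call
        return n;
    }

    /** Refills the block sequentially from the channel; returns false once no data is left. */
    private boolean refill() throws IOException {
        block.clear();
        while (block.hasRemaining()) {
            diskReads++;
            if (channel.read(block) < 0) {
                break; // end of file
            }
        }
        block.flip();
        return block.hasRemaining();
    }

    long diskReads() {
        return diskReads;
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }
}
```

A caller would open the reader with `new LargeBlockReader(path, LargeBlockReader.BLOCK_SIZE)` and loop on `read(...)` until it returns -1. Consuming a file in 4KB chunks this way should issue far fewer channel reads than chunk reads, which is the effect the proposal is after. A real implementation would also need to consider buffer pooling, direct buffers, and compressed chunk boundaries, none of which are covered here.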
Attachments
Issue Links
- is duplicated by
  - CASSANDRA-19607 Compaction double reads every chunk (Resolved)
  - CASSANDRA-19494 Optimize I/O during table scans (Resolved)