Details
- Type: Improvement
- Status: Resolved
- Priority: Normal
- Resolution: Fixed
Description
Compressing incompressible data, as happens for instance when writing SSTables during stress tests, produces chunks larger than 64 KiB, which the buffer-pooling mechanism employed by the CompressedRandomAccessReader cannot handle. This causes non-negligible performance degradation due to excessive memory allocation.
To solve this problem, and to avoid decompression overhead in cases where compression provides no benefit, I think we should allow compressed files to store uncompressed chunks as an alternative to compressed data. Such a chunk would be written whenever compression returns a buffer larger than, for example, 90% of the input, and would add no delay on the write path. On reads it could be recognized by its size alone (using a single global threshold constant stored in the compression metadata), and its data could be transferred directly into the destination buffer, skipping the decompression step and ensuring that a 64 KiB buffer for compressed data always suffices. A minimal sketch of both paths follows.
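The sketch below illustrates the intended write- and read-side decisions under the assumptions stated above (64 KiB chunks, a 90% threshold). All names in it (UncompressedChunkSketch, maybeCompress, readChunk, MIN_COMPRESS_RATIO, the Compressor interface) are hypothetical and not taken from the Cassandra codebase; it is meant only to make the decision logic concrete.

{code:java}
import java.nio.ByteBuffer;

public class UncompressedChunkSketch
{
    // Configured chunk length for the sstable; 64 KiB matches the
    // pooled-buffer size mentioned in the description.
    static final int CHUNK_SIZE = 64 * 1024;

    // Store a chunk uncompressed unless compression shrinks it below
    // 90% of the input, per the example threshold above.
    static final double MIN_COMPRESS_RATIO = 0.9;

    /** Stand-in for the real compressor interface (hypothetical). */
    interface Compressor
    {
        ByteBuffer compress(ByteBuffer input);
        void decompress(ByteBuffer input, ByteBuffer output);
    }

    /**
     * Write path: try to compress, but keep the raw bytes when the
     * compressed output does not beat the threshold. Stored compressed
     * chunks are therefore always strictly below the threshold length.
     */
    static ByteBuffer maybeCompress(ByteBuffer input, Compressor compressor)
    {
        ByteBuffer compressed = compressor.compress(input.duplicate());
        int threshold = (int) (input.remaining() * MIN_COMPRESS_RATIO);
        return compressed.remaining() >= threshold
               ? input.duplicate()   // store uncompressed, no extra work
               : compressed;
    }

    /**
     * Read path: a stored length at or above the threshold for the
     * chunk's uncompressed length marks it as uncompressed, so the
     * bytes are copied straight into the destination buffer.
     */
    static void readChunk(ByteBuffer stored, int uncompressedLength,
                          ByteBuffer destination, Compressor compressor)
    {
        if (stored.remaining() >= (int) (uncompressedLength * MIN_COMPRESS_RATIO))
            destination.put(stored.duplicate());   // skip decompression
        else
            compressor.decompress(stored.duplicate(), destination);
    }
}
{code}

Because the write path only ever stores compressed chunks that fall strictly below the threshold, the stored length alone is enough to disambiguate the two cases, so no per-chunk flag is needed beyond the single threshold constant in the compression metadata.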
Attachments
Issue Links
- Blocked
  - CASSANDRA-13703 Using min_compress_ratio <= 1 causes corruption (Resolved)
- blocks
  - CASSANDRA-13250 dtest failure in cqlsh_tests.cqlsh_tests.TestCqlsh.test_describe (Resolved)
- is related to
  - CASSANDRA-13274 Fix code to not exchange schema across major versions (Resolved)
- relates to
  - CASSANDRA-13899 Fix buffer length comparison when decompressing in netty-based streaming (Resolved)
  - CASSANDRA-8895 Compressed sstables should only compress if the win is above a certain threshold, and should use a variable block size (Open)
  - CASSANDRA-10407 Benchmark and evaluate CASSANDRA-8894 improvements (Open)