[CASSANDRA-15299] CASSANDRA-13304 follow-up: improve checksumming and compression in protocol v5-beta - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Normal
Resolution: Fixed
Fix Version/s: 4.0-beta4, 4.0
Component/s: Messaging/Client
Labels:
- protocolv5

Change Category: Semantic
Complexity: Normal
Platform: All
Impacts: Clients
Source Control Link: https://github.com/apache/cassandra/commit/a7c4ba9eeecb365e7c4753d8eaab747edd9a632a
Test and Documentation Plan:
- Validate with java-driver test suites
- Improve coverage of client/server interaction in unit/in-jvm dtests
- Add burn tests
- Update v5 protocol spec

Description

CASSANDRA-13304 made an important improvement to our native protocol: it introduced checksumming/CRC32 to request and response bodies. It’s an important step forward, but it doesn’t cover the entire stream. In particular, the message header is not covered by a checksum or a CRC, which poses a correctness issue if, for example, streamId gets corrupted.

Additionally, we aren’t quite using CRC32 correctly, in two ways:
1. We are calculating the CRC32 of the decompressed value instead of computing the CRC32 on the bytes written on the wire, which loses the properties of the CRC32. In some cases, due to this sequencing, attempting to decompress a corrupt stream can cause LZ4 to segfault.
2. When using CRC32, the CRC32 value is written in the incorrect byte order, which also loses some of the protections.

See https://users.ece.cmu.edu/~koopman/pubs/KoopmanCRCWebinar9May2012.pdf for an explanation of the two points above.
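To illustrate the first point, here is a minimal Python sketch of the intended sequencing: the CRC32 is computed over the compressed bytes that go on the wire, and it is verified before any decompression is attempted. This is not the Cassandra implementation. `zlib` stands in for LZ4, `encode_body` and `decode_body` are hypothetical names, and the little-endian trailer is an assumption made only so that writer and reader agree; the ticket does not state which byte order is the correct one.

```python
import struct
import zlib


def encode_body(body: bytes) -> bytes:
    """Compress the body, then append a CRC32 of the compressed (on-wire) bytes."""
    compressed = zlib.compress(body)  # stand-in for LZ4 in this sketch
    crc = zlib.crc32(compressed)
    # Fixed little-endian byte order: an assumption for illustration only.
    return compressed + struct.pack("<I", crc)


def decode_body(wire: bytes) -> bytes:
    """Verify the CRC32 over the on-wire bytes before attempting to decompress."""
    if len(wire) < 4:
        raise ValueError("truncated body")
    compressed, trailer = wire[:-4], wire[-4:]
    (expected,) = struct.unpack("<I", trailer)
    if zlib.crc32(compressed) != expected:
        # Corrupt bytes never reach the decompressor.
        raise ValueError("CRC32 mismatch: refusing to decompress corrupt bytes")
    return zlib.decompress(compressed)
```

Because the check runs first, a flipped bit on the wire is rejected with an error instead of being handed to the decompressor.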

Separately, the protocol has some long-standing issues that predate CASSANDRA-13304 by a long way. Most importantly, both checksumming and compression operate on individual message bodies rather than on frames of multiple complete messages. This has several downsides. To name a couple:

- For compression, we get poor compression ratios for smaller messages, since we are operating on tiny sequences of bytes. In practice, for most small requests and responses we discard the compressed value because it is no smaller than the uncompressed one, incurring both redundant allocations and redundant compressions.
- For checksumming and CRC32, we pay a high overhead price for small messages. 4 extra bytes is a lot for an empty write response, for example.
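The two costs above can be measured with a rough Python sketch. It is only an illustration: `zlib` stands in for LZ4, the message bodies are made-up placeholders rather than real protocol messages, and `per_message_cost` and `per_frame_cost` are hypothetical helper names.

```python
import zlib

CHECKSUM_BYTES = 4  # size of one CRC32 value


def per_message_cost(bodies):
    """Return (wasted_compressions, checksum_bytes) when each body is handled alone."""
    # A compression is wasted when its output is no smaller than its input.
    wasted = sum(1 for body in bodies if len(zlib.compress(body)) >= len(body))
    return wasted, CHECKSUM_BYTES * len(bodies)


def per_frame_cost(bodies):
    """Return (compressed_size, checksum_bytes) when all bodies share one frame."""
    joined = b"".join(bodies)
    return len(zlib.compress(joined)), CHECKSUM_BYTES


# Made-up tiny bodies, standing in for small requests and responses.
bodies = [b"stream=%d;result=void" % i for i in range(100)]

wasted, per_message_crc = per_message_cost(bodies)
framed_size, per_frame_crc = per_frame_cost(bodies)

print("wasted per-message compressions:", wasted, "of", len(bodies))
print("checksum bytes, per message vs per frame:", per_message_crc, "vs", per_frame_crc)
print("raw bytes vs one compressed frame:", sum(map(len, bodies)), "vs", framed_size)
```

The exact numbers depend on the compressor and the payloads, but the shape of the result is the point: tiny inputs rarely shrink on their own, while one checksum and one compression pass per group are amortized over every message in it.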

To address the correctness issue of streamId not being covered by the checksum/CRC32 and the inefficiency in compression and checksumming/CRC32, we should switch to a framing protocol with multiple messages in a single frame.
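As a rough sketch of what "multiple messages in a single frame" could look like, the Python below packs several messages, each with its own stream id, into one frame with a checksummed length header and a single CRC32 over the whole payload. The layout is hypothetical and chosen for brevity. It is not the internode framing from CASSANDRA-15066 or the final v5 format, and `build_frame` and `parse_frame` are illustrative names.

```python
import struct
import zlib


def build_frame(messages):
    """Pack (stream_id, body) pairs into one frame.

    Hypothetical layout, all integers little-endian:
      [payload length: 4][CRC32 of length: 4][payload][CRC32 of payload: 4]
    where payload is a sequence of [stream id: 2][body length: 4][body].
    """
    payload = b"".join(
        struct.pack("<HI", stream_id, len(body)) + body
        for stream_id, body in messages
    )
    length = struct.pack("<I", len(payload))
    header = length + struct.pack("<I", zlib.crc32(length))
    trailer = struct.pack("<I", zlib.crc32(payload))
    return header + payload + trailer


def parse_frame(frame):
    """Verify both checksums, then split the payload back into messages."""
    if len(frame) < 12:
        raise ValueError("truncated frame")
    length_bytes = frame[:4]
    (header_crc,) = struct.unpack("<I", frame[4:8])
    if header_crc != zlib.crc32(length_bytes):
        raise ValueError("corrupt frame header")
    (length,) = struct.unpack("<I", length_bytes)
    if len(frame) != 8 + length + 4:
        raise ValueError("frame length mismatch")
    payload = frame[8:8 + length]
    (payload_crc,) = struct.unpack("<I", frame[8 + length:])
    if payload_crc != zlib.crc32(payload):
        raise ValueError("corrupt frame payload")
    messages, offset = [], 0
    while offset < length:
        stream_id, body_len = struct.unpack_from("<HI", payload, offset)
        offset += 6
        messages.append((stream_id, payload[offset:offset + body_len]))
        offset += body_len
    return messages
```

Since each stream id sits inside the checksummed payload, a corrupted stream id fails the payload CRC32 check instead of silently routing a response to the wrong request, and the 4-byte trailer is paid once per frame rather than once per message.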

I suggest we reuse the framing protocol recently implemented for internode messaging in CASSANDRA-15066 to the extent that its logic can be borrowed, and that we do it before native protocol v5 graduates from beta. See https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/net/FrameDecoderCrc.java and https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/net/FrameDecoderLZ4.java.

Attachments

- Process CQL Frame.png (22 kB, 04/Nov/20 13:35, Alex Petrov)
- V5 Flow Chart.png (73 kB, 04/Nov/20 13:35, Alex Petrov)

Issue Links

blocks
- CASSANDRA-15313 Fix flaky - ChecksummingTransformerTest - org.apache.cassandra.transport.frame.checksum.ChecksummingTransformerTest (Resolved)

is related to
- CASSANDRA-15786 ChecksummingTransformerTest#corruptionCausesFailure fails for seed 43595190254702 (Resolved)

relates to
- CASSANDRA-15556 When a LZ4 stream is corrupted it could cause the JVM to crash (Resolved)
- CASSANDRA-15313 Fix flaky - ChecksummingTransformerTest - org.apache.cassandra.transport.frame.checksum.ChecksummingTransformerTest (Resolved)
- CASSANDRA-16613 ProtocolVersion.V4 is still used in places in the code (Resolved)
- CASSANDRA-16663 Request-Based Native Transport Rate-Limiting (Resolved)
- CASSANDRA-16373 Deploy testing docker images to hub.docker.com/u/apache/ (Resolved)

links to
- Java driver PR - 3.x

People

Assignee: Sam Tunnicliffe
Reporter: Aleksey Yeschenko
Authors: Sam Tunnicliffe
Reviewers: Alex Petrov, Caleb Rackliffe
Votes: 1
Watchers: 25

Dates

Created: 02/Sep/19 13:45
Updated: 16/Jun/21 17:26
Resolved: 01/Dec/20 18:58