[CASSANDRA-2540] Data reads by default - ASF JIRA

Details

Type: Improvement
Status: Resolved
Priority: Low
Resolution: Won't Fix
Fix Version/s: None
Component/s: None
Labels: None

Description

The intention of digest vs data reads is to save bandwidth in the read path at the cost of latency, but I expect that this has been a premature optimization.

Data requested by a read will often be within an order of magnitude of the digest size, and a failed digest means extra round trips and more bandwidth.
The "digest reads succeed but the data read does not" problem means QUORUM reads fail because a single node is unavailable; avoiding this would require eagerly re-requesting the data at some fraction of the timeout.
Saving bandwidth in cross-datacenter use cases comes at a huge cost to latency, but since both constraints change proportionally (enough), the tradeoff is not clear.
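The second point can be illustrated with a toy model. This is only a sketch under stated assumptions, not Cassandra's actual read path: the names `Replica` and `quorum_read` are hypothetical, the coordinator is assumed to send the single data request to the first replica and digest requests to the rest, and digest mismatches and retries are ignored.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Replica:
    """Hypothetical stand-in for a replica node holding one value."""
    value: bytes
    alive: bool = True


def quorum_read(replicas: List[Replica], required: int,
                data_by_default: bool) -> Optional[bytes]:
    """Toy coordinator: returns the value, or None if the read fails.

    Assumption: in digest mode only replicas[0] receives a data request;
    every other replica receives a digest request.
    """
    data_responses = []
    digest_responses = 0
    for index, replica in enumerate(replicas):
        if not replica.alive:
            continue  # no response arrives before the timeout
        if data_by_default or index == 0:
            data_responses.append(replica.value)
        else:
            digest_responses += 1

    if len(data_responses) + digest_responses < required:
        return None  # not enough responses for the consistency level
    if not data_responses:
        return None  # enough digests arrived, but there is no data to return
    return data_responses[0]


# N = 3, QUORUM = 2, and the one node asked for data is down.
nodes = [Replica(b"v", alive=False), Replica(b"v"), Replica(b"v")]
digest_mode_result = quorum_read(nodes, required=2, data_by_default=False)
data_mode_result = quorum_read(nodes, required=2, data_by_default=True)
```

In this model the digest-mode read fails even though two of three replicas answered, while the data-by-default read succeeds with the same nodes available.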

Some options:

Add an option to use digest reads
Remove digest reads entirely (and/or punt and make them a runtime optimization based on data size in the future)
Continue to use digest reads, but send them to N - R nodes for (somewhat) more predictable behavior with QUORUM

The outcome of data-reads-by-default should be significantly improved latency, with a moderate increase in bandwidth usage for large reads.
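The bandwidth side of this tradeoff can be estimated with simple arithmetic. The sketch below is a rough illustration only: the function `response_bytes` is hypothetical, the 16-byte digest size is an assumption (the size of an MD5 hash), and request overhead and mismatch retries are ignored.

```python
DIGEST_BYTES = 16  # assumption: an MD5-sized digest


def response_bytes(data_size: int, contacted: int, data_by_default: bool) -> int:
    """Bytes returned to the coordinator by `contacted` replicas for one read."""
    if data_by_default:
        # Every contacted replica returns the full data.
        return contacted * data_size
    # One replica returns the data; the others return only a digest.
    return data_size + (contacted - 1) * DIGEST_BYTES


# A small read (100 bytes) and a large read (10,000 bytes), 3 replicas contacted.
small_digest = response_bytes(100, 3, data_by_default=False)     # 132
small_data = response_bytes(100, 3, data_by_default=True)        # 300
large_digest = response_bytes(10_000, 3, data_by_default=False)  # 10032
large_data = response_bytes(10_000, 3, data_by_default=True)     # 30000
```

Under these assumptions the absolute difference is small for small reads and grows with the size of the data read.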

People

Assignee: Peter Schuller

Reporter: Stu Hood

Authors: Peter Schuller

Votes: 2

Watchers: 5

Dates

Created: 22/Apr/11 05:53

Updated: 16/Apr/19 09:33

Resolved: 24/Sep/12 13:57