Details
- Type: New Feature
- Status: Open
- Priority: Normal
- Resolution: Unresolved
Description
This is a proposal for a new feature: mapping custom types to Cassandra columns.
These types would provide a creation function and a merge function, to be implemented in Java by the user.
This feature relates to the concept of CRDTs. The proposal is to replicate "operations" on these types at write time, to apply these operations internally during merge (Column.reconcile), and to merge their values again on read.
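To make the shape of such a type concrete, here is a minimal sketch of what a user-supplied creation function and merge function could look like. The interface name `MergeableType`, its method names, and the `MaxType` class are hypothetical and chosen for illustration only; they are not taken from the prototype or from any Cassandra API.

```java
// Hypothetical interface: names are illustrative, not the prototype's API.
interface MergeableType<I, V> {
    // Creation function: turns a written input into an initial value.
    V create(I input);

    // Merge function: combines two values into one.
    V merge(V left, V right);
}

// MAX(value) for a column, expressed as a mergeable type over longs.
final class MaxType implements MergeableType<Long, Long> {
    @Override
    public Long create(Long input) {
        return input;
    }

    @Override
    public Long merge(Long left, Long right) {
        return Math.max(left, right);
    }
}

class MergeableTypeDemo {
    public static void main(String[] args) {
        MaxType max = new MaxType();
        Long a = max.create(3L);
        Long b = max.create(7L);
        // The argument order does not change the result.
        System.out.println(max.merge(a, b)); // prints 7
        System.out.println(max.merge(b, a)); // prints 7
    }
}
```

In this sketch the same `merge` would be the function invoked both during reconciliation and on read, which is why no read-before-write is needed for a type like MAX.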
The following operations are made possible without reading back any data:
- MIN or MAX(value) for a column
- First value for a column
- Count Distinct
- HyperLogLog
- Count-Min
Any composition of these is possible too; for example, a Candlestick type combines first, last, min, and max.
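As an illustration of composition, a Candlestick could be sketched as below. This is not the prototype's implementation: the class layout, the use of explicit observation timestamps to decide "first" and "last", and the tie-breaking rule on equal timestamps are all assumptions made for this example.

```java
// Hypothetical Candlestick value; fields and method names are assumptions.
final class Candlestick {
    final long firstTs;   // timestamp of the earliest observation
    final double first;   // value of the earliest observation
    final long lastTs;    // timestamp of the latest observation
    final double last;    // value of the latest observation
    final double min;
    final double max;

    Candlestick(long firstTs, double first, long lastTs, double last,
                double min, double max) {
        this.firstTs = firstTs;
        this.first = first;
        this.lastTs = lastTs;
        this.last = last;
        this.min = min;
        this.max = max;
    }

    // Creation function: one observation is its own first, last, min, and max.
    static Candlestick create(long timestamp, double value) {
        return new Candlestick(timestamp, value, timestamp, value, value, value);
    }

    // Merge function. Ties on timestamp are broken by value (an assumption)
    // so that merge(a, b) and merge(b, a) always agree.
    static Candlestick merge(Candlestick a, Candlestick b) {
        boolean aFirst = a.firstTs < b.firstTs
                || (a.firstTs == b.firstTs && a.first <= b.first);
        boolean aLast = a.lastTs > b.lastTs
                || (a.lastTs == b.lastTs && a.last >= b.last);
        return new Candlestick(
                aFirst ? a.firstTs : b.firstTs, aFirst ? a.first : b.first,
                aLast ? a.lastTs : b.lastTs, aLast ? a.last : b.last,
                Math.min(a.min, b.min), Math.max(a.max, b.max));
    }
}

class CandlestickDemo {
    public static void main(String[] args) {
        Candlestick c1 = Candlestick.create(1L, 10.0);
        Candlestick c2 = Candlestick.create(2L, 12.5);
        Candlestick c3 = Candlestick.create(3L, 9.0);
        // Merge out of timestamp order on purpose.
        Candlestick c = Candlestick.merge(Candlestick.merge(c3, c1), c2);
        System.out.println(c.first + " " + c.last + " " + c.min + " " + c.max);
        // prints 10.0 9.0 9.0 12.5
    }
}
```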
The merge operations exposed by these types need to be commutative; this is the case for many functions used in analytics.
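The requirement exists because replicas may reconcile the same values in different orders and must still arrive at the same result. The sketch below checks this by brute force for a small input: it folds a candidate merge function over every permutation of the values and compares the results. The class and method names are hypothetical; note also that order independence of a fold relies on associativity as well as commutativity, and this check exercises both at once.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.BinaryOperator;

class MergeOrderCheck {
    // Folds the values with the given merge function, in list order.
    static long reconcile(List<Long> values, BinaryOperator<Long> merge) {
        return values.stream().reduce(merge)
                .orElseThrow(IllegalArgumentException::new);
    }

    // Collects every permutation of items (intended for small lists only).
    static void permute(List<Long> items, int k, List<List<Long>> out) {
        if (k == items.size()) {
            out.add(new ArrayList<>(items));
            return;
        }
        for (int i = k; i < items.size(); i++) {
            Collections.swap(items, k, i);
            permute(items, k + 1, out);
            Collections.swap(items, k, i); // restore before the next choice
        }
    }

    // True if every merge order yields the same result.
    static boolean orderIndependent(List<Long> values, BinaryOperator<Long> merge) {
        long expected = reconcile(values, merge);
        List<List<Long>> orders = new ArrayList<>();
        permute(new ArrayList<>(values), 0, orders);
        for (List<Long> order : orders) {
            if (reconcile(order, merge) != expected) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Long> values = List.of(5L, 1L, 9L, 3L);
        // MIN gives the same answer whatever the order.
        System.out.println(orderIndependent(values, Math::min));    // prints true
        // "Keep whichever value arrived last" depends on arrival order.
        System.out.println(orderIndependent(values, (a, b) -> b));  // prints false
    }
}
```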
This feature is incomplete without some integration with CASSANDRA-4775 (Counters 2.0), which provides a Read-Modify-Write implementation for distributed counters. Integrating custom creation and merge functions with the new counters would let users implement complex CRDTs in Cassandra, including:
- Averages & related (sum of squares, standard deviation)
- Graphs
- Sets
- Custom registers (even with vector clocks)
I have a working prototype with implementations for min, max, and Candlestick at https://github.com/acunu/cassandra/tree/crdts - I'd appreciate any feedback on the design and interfaces.
Issue Links
- contains: CASSANDRA-7297 semi-immutable CQL rows (Open)
- requires: CASSANDRA-7423 Allow updating individual subfields of UDT (Resolved)
- requires: CASSANDRA-8099 Refactor and modernize the storage engine (Resolved)