Details
-
New Feature
-
Status: Resolved
-
Normal
-
Resolution: Fixed
-
None
-
Semantic
-
Normal
-
All
-
None
-
Description
The dev list thread regarding CQL transaction support seems to have converged on a syntax.
The thread is here: https://lists.apache.org/thread/5sds3968mnnk42c24pvgwphg4qvo2xk0
The message proposing the syntax ultimately agreed on is here: https://lists.apache.org/thread/y289tczngj68bqpoo7gkso3bzmtf86pl
I'll describe my understanding of the agreed syntax here for, but I'd suggest reading through the thread.
The example query is this:
BEGIN TRANSACTION LET car = (SELECT miles_driven, is_running FROM cars WHERE model=’pinto’); LET user = (SELECT miles_driven FROM users WHERE name=’Blake’); SELECT car.is_running, car.miles_driven; IF car.is_running THEN UPDATE users SET miles_driven = user.miles_driven + 30 WHERE name='blake'; UPDATE cars SET miles_driven = car.miles_driven + 30 WHERE model='pinto'; END IF COMMIT TRANSACTION
Sections are described below, and we want to require the statement enforces an order on the different types of clauses. First, assignments, then select(s), then conditional updates. This may be relaxed in the future, but is meant to prevent users from interleaving updates and assignments and expecting read your own write behavior that we don't want to support in v1.
Reference assignments
LET car = (SELECT miles_driven, is_running FROM cars WHERE model=’pinto’)
The first part is basically assigning the result of a SELECT statement to a named reference that can be used in updates/conditions and be returned to the user. Tuple names belong to a global scope and must not clash with other LET statements or update statements (more on that in the updates section). Also, the select statement must either be a point read, with all primary key columns defined, or explicitly set a limit of 1.
Selection
Data to returned to client. Currently, we're only supporting a single select statement. Either a normal select statement, or one returning values assigned by LET statements as shown in the example. Ultimately we'll want to support multiple select statements and returning the results to the client. Although that will require a protocol change.
Updates
Normal inserts/updates/deletes with the following extensions:
- Inserts and updates can reference values assigned earlier in the statement
- Updates can reference their own columns:
miles_driven = miles_driven + 30
- or -
miles_driven += 30
These will of course need to generate any required reads behind the scenes. There's no precedence of table vs reference names, so if a relevant column name and reference name conflict, the query needs to fail to parse.
If blocks
IF <some conditions> THEN <some update>; <another update>; END IF
For v1, we only need to support a single condition in the format above. In the future, we'd like to support multiple branches with multiple conditions like:
IF <some conditions> THEN <some update>; <another update>; ELSE IF <another condition> THEN <yet another update>; ELSE <an update>; END IF
Conditions
Comparisons of value references to literals or other references. IS NULL / IS NOT NULL also needs to be supported. Multiple comparisons need to be supported, but v1 only needs to support AND'ing them together.
Supported operators: =, !=, >, >=, <, <=, IS NULL, IS NOT NULL <value_reference> = 5 <value_reference> IS NOT NULL IF car IS NOT NULL AND car.miles_driven > 30 IF car.miles_driven = user.miles_driven
(Note that IS[ NOT ]NULL can apply to both tuple references and individually dereferenced columns.)
CONTAINS and CONTAINS KEY require indexed collections, and will not be supported in v1.
Implementation notes
I have a proof of concept I'd created to demo the initially proposed syntax here: https://github.com/bdeggleston/cassandra/tree/accord-cql-poc, It could be used as a starting point, a source of ideas for approaches, or not used at all. The main thing to keep in mind is that value references prevent creating read commands and mutations until later in the transaction process, and potentially on another machine, which means we can't create accord transactions with fully formed mutations and read commands. CQL statements and terms are also not serializable, and would require a ton of work to properly serialize, so there will need to be some intermediate stage that can be serialized.
Attachments
Issue Links
- relates to
-
CASSANDRA-17762 LWT IF col = NULL is inconsistent with SQL NULL
- Open
-
CASSANDRA-17104 CEP-15: (C*/Accord) Transaction state storage
- Patch Available
- split to
-
CASSANDRA-18107 CEP-15: Multi-Partition Transaction CQL Support (Alpha -> v1)
- Open