  1. Cassandra
  2. CASSANDRA-8043 Native Protocol V4
  3. CASSANDRA-7304

Ability to distinguish between NULL and UNSET values in Prepared Statements

Details

    • Sub-task
    • Status: Resolved
    • Normal
    • Resolution: Fixed
    • 2.2.0 beta 1
    • None
    • Clients

    Description

      Currently, Cassandra inserts a tombstone whenever a column is bound to NULL in a prepared statement. At high insert rates, managing all these tombstones becomes unnecessary overhead. This limits the usefulness of prepared statements, since developers must either create multiple prepared statements (one per combination of column names, which is often infeasible because of the sheer number of possible combinations) or fall back to regular (non-prepared) statements.

      This JIRA explores the possibility of either:
      A. Having a flag on prepared statements that, once set, tells Cassandra to ignore null columns

      or

      B. Having an "UNSET" value that makes Cassandra skip the null columns and not tombstone them

      Basically, in the context of a prepared statement, a null value means delete, but we don’t have anything that means "ignore" (besides creating a new prepared statement without the ignored column).

      Please refer to the original conversation on DataStax Java Driver mailing list for more background:
      https://groups.google.com/a/lists.datastax.com/d/topic/java-driver-user/cHE3OOSIXBU/discussion

      EDIT 18/12/14 - odpeer Implementation Notes:

      The motivation hasn't changed.

      Protocol version 4 specifies that bind variables do not require having a value when executing a statement. Bind variables without a value are called 'unset'. The 'unset' bind variable is serialized as the int value '-2' without following bytes.
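
      As a rough illustration of the wire encoding described above, here is a minimal Python sketch. The UNSET sentinel and the encode_value helper are hypothetical names used only for this example and are not part of any driver API. The length -1 for null comes from the native protocol's [value] definition rather than from this ticket, and the sketch assumes the protocol's 4-byte big-endian [int].

```python
import struct

# Hypothetical sentinel for "no value bound"; not a real driver constant.
UNSET = object()


def encode_value(value):
    """Encode one bind value as a 4-byte big-endian signed length
    followed by that many bytes (already-serialized bytes expected)."""
    if value is UNSET:
        # unset: length -2, no bytes follow
        return struct.pack(">i", -2)
    if value is None:
        # null: length -1, no bytes follow
        return struct.pack(">i", -1)
    return struct.pack(">i", len(value)) + value
```

      Under these assumptions, an unset variable costs the same four bytes on the wire as a null, so the distinction is carried entirely by the length field.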

      • An unset bind variable in an EXECUTE or BATCH request
        • On a value does not modify the value and does not create a tombstone
        • On the ttl clause is treated as 'unlimited'
        • On the timestamp clause is treated as 'now'
        • On a map key or a list index throws InvalidRequestException
        • On a counter increment or decrement operation does not change the counter value, e.g. UPDATE my_tab SET c = c - ? WHERE k = 1 does not change the value of counter c
        • On a tuple field or UDT field throws InvalidRequestException
      • An unset bind variable in a QUERY request
        • On a partition column, clustering column or index column in the WHERE clause throws InvalidRequestException
        • On the limit clause is treated as 'unlimited'
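
      The difference between a null and an unset value in the rules above can be sketched with a toy in-memory model. This is only an illustration under stated assumptions: a row is a plain dict, a tombstone is recorded as a column name in a list, and UNSET, apply_update and apply_counter_delta are hypothetical names that do not correspond to Cassandra internals or to any driver API.

```python
# Hypothetical sentinel for "no value bound"; not a real driver constant.
UNSET = object()


def apply_update(row, bound):
    """Apply bound column values to a copy of `row`.

    Returns (new_row, tombstones), where tombstones lists the columns
    that were deleted because they were bound to None (null).
    """
    new_row = dict(row)
    tombstones = []
    for column, value in bound.items():
        if value is UNSET:
            # unset: leave the stored value alone, write no tombstone
            continue
        if value is None:
            # null: delete the column and record a tombstone
            new_row.pop(column, None)
            tombstones.append(column)
        else:
            new_row[column] = value
    return new_row, tombstones


def apply_counter_delta(current, delta):
    """Model `SET c = c - ?`: an unset delta leaves the counter unchanged."""
    if delta is UNSET:
        return current
    return current - delta
```

      For example, binding {"a": 10, "b": None, "c": UNSET} against a stored row {"a": 1, "b": 2, "c": 3} overwrites a, tombstones b, and leaves c as it was.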

      Attachments

        1. 7304.patch
          19 kB
          Oded Peer
        2. 7304-03.patch
          45 kB
          Oded Peer
        3. 7304-04.patch
          68 kB
          Oded Peer
        4. 7304-05.patch
          67 kB
          Oded Peer
        5. 7304-06.patch
          79 kB
          Oded Peer
        6. 7304-07.patch
          73 kB
          Oded Peer
        7. 7304-2.patch
          7 kB
          Oded Peer
        8. 7304-V8.txt
          83 kB
          Benjamin Lerer

            People

              odpeer Oded Peer
              drew_kutchar Drew Kutcharian
              Oded Peer
              Benjamin Lerer
              Votes: 5
              Watchers: 20
