Details

    • Type: Sub-task Sub-task
    • Status: Resolved
    • Priority: Minor Minor
    • Resolution: Fixed
    • Fix Version/s: 1.2.4
    • Component/s: API
    • Labels:

      Description

      Dense composite supports adding records where only a prefix of all the component specifying the key is defined. In other words, with:

      CREATE TABLE connections (
         userid int,
         ip text,
         port int,
         protocol text,
         time timestamp,
         PRIMARY KEY (userid, ip, port, protocol)
      ) WITH COMPACT STORAGE
      

      you can insert

      INSERT INTO connections (userid, ip, port, time) VALUES (2, '192.168.0.1', 80, 123456789);
      

      You cannot however select that column specifically (i.e, without selecting column (2, '192.168.0.1', 80, 'http') for instance).
      This ticket proposes to allow that though 'null', i.e. to allow

      SELECT * FROM connections WHERE userid = 2 AND ip = '192.168.0.1' AND port = 80 AND protocol = null;
      

      It would then also make sense to support:

      INSERT INTO connections (userid, ip, port, protocol, time) VALUES (2, '192.168.0.1', 80, null, 123456789);
      

      as an equivalent to the insert query above.

      1. 3783-wip-v1.patch
        2 kB
        Michał Michalski
      2. 3783-v2.patch
        11 kB
        Michał Michalski
      3. 3783-v3.patch
        11 kB
        Michał Michalski
      4. 3783-v4.txt
        12 kB
        Sylvain Lebresne
      5. 3783-v5.txt
        11 kB
        Aleksey Yeschenko

        Issue Links

          Activity

          Aleksey Yeschenko made changes -
          Link This issue is duplicated by CASSANDRA-5416 [ CASSANDRA-5416 ]
          Hide
          Aleksey Yeschenko added a comment -

          There isn't, and inserting nulls via both INSERT and UPDATE works fine for me, prepared and not, cassandra-1.2.4 and 1.2.5.

          Is your Cassandra cluster at 1.2.4+? If yes, then it must be a cass-jdbc issue. Do the same queries work fine in cqlsh?

          Show
          Aleksey Yeschenko added a comment - There isn't, and inserting nulls via both INSERT and UPDATE works fine for me, prepared and not, cassandra-1.2.4 and 1.2.5. Is your Cassandra cluster at 1.2.4+? If yes, then it must be a cass-jdbc issue. Do the same queries work fine in cqlsh?
          Hide
          Constance Eustace added a comment -

          I am using cass-jdbc 1.2.5 and have done several UPDATE and INSERT statements using NULL or null for literal (non-prepared) statements which do not parse, while using Types.NULL in a prepared statement or a null string + Types.VARCHAR in a prepared statement result in the insert/update of an empty string as visible by cqlsh...

          This is marked as fixed as of 1.2.4...

          Is there a new Keyword for INSERT/UPDATE of NULL?

          Show
          Constance Eustace added a comment - I am using cass-jdbc 1.2.5 and have done several UPDATE and INSERT statements using NULL or null for literal (non-prepared) statements which do not parse, while using Types.NULL in a prepared statement or a null string + Types.VARCHAR in a prepared statement result in the insert/update of an empty string as visible by cqlsh... This is marked as fixed as of 1.2.4... Is there a new Keyword for INSERT/UPDATE of NULL?
          Hide
          Sylvain Lebresne added a comment -

          Sorry, I didn't saw your v5 to be honest, I made the change directly. But on the plus side, my fix is probably better because your v5 would write a "null is not supported inside collections" for nested collections I believe.

          Show
          Sylvain Lebresne added a comment - Sorry, I didn't saw your v5 to be honest, I made the change directly. But on the plus side, my fix is probably better because your v5 would write a "null is not supported inside collections" for nested collections I believe.
          Sylvain Lebresne made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Resolution Fixed [ 1 ]
          Hide
          Sylvain Lebresne added a comment -

          Alright then. Committed with the fix to the error message. Thanks

          Show
          Sylvain Lebresne added a comment - Alright then. Committed with the fix to the error message. Thanks
          Aleksey Yeschenko made changes -
          Attachment 3783-v5.txt [ 12577004 ]
          Hide
          Aleksey Yeschenko added a comment - - edited

          Attached v5 with fixed null-in-collection-literals validation.

          Show
          Aleksey Yeschenko added a comment - - edited Attached v5 with fixed null-in-collection-literals validation.
          Aleksey Yeschenko made changes -
          Reviewer iamaleksey
          Brandon Williams made changes -
          Status Testing [ 10012 ] Open [ 1 ]
          made changes -
          Status Patch Available [ 10002 ] Testing [ 10012 ]
          Sylvain Lebresne made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Sylvain Lebresne made changes -
          Attachment 3783-v4.txt [ 12574797 ]
          Hide
          Sylvain Lebresne added a comment -

          Sorry I wasn't clear, what I meant earlier is that CASSANDRA-5081 already has the tests needed to handle null. Typically, the test the patch adds to Operation is already in Constants.Setter and related collection methods (the goal being to have the same code to handle prepared and non-prepared statement).

          So attaching a v4 that reuse the code introduced by CASSANDRA-5081. It makes null a separate object rather than another constant because that felt slightly cleaner/simpler). Also, in the parser, null is a 'value' instead of 'constant': the difference is minor but it avoid having to deal with null in CREATE/ALTER property values (since they are 'constant').

          The patch also validate that null values are not used inside a collection and simplify/fix a few collections methods.

          Show
          Sylvain Lebresne added a comment - Sorry I wasn't clear, what I meant earlier is that CASSANDRA-5081 already has the tests needed to handle null. Typically, the test the patch adds to Operation is already in Constants.Setter and related collection methods (the goal being to have the same code to handle prepared and non-prepared statement). So attaching a v4 that reuse the code introduced by CASSANDRA-5081 . It makes null a separate object rather than another constant because that felt slightly cleaner/simpler). Also, in the parser, null is a 'value' instead of 'constant': the difference is minor but it avoid having to deal with null in CREATE/ALTER property values (since they are 'constant'). The patch also validate that null values are not used inside a collection and simplify/fix a few collections methods.
          Sylvain Lebresne made changes -
          Fix Version/s 1.2.4 [ 12324157 ]
          Fix Version/s 1.2.3 [ 12324089 ]
          Michał Michalski made changes -
          Attachment 3783-v3.patch [ 12572138 ]
          Hide
          Michał Michalski added a comment -

          No problem, Sylvain
          Attaching rebased patch.

          Show
          Michał Michalski added a comment - No problem, Sylvain Attaching rebased patch.
          Hide
          Sylvain Lebresne added a comment -

          As for the attached patch, I'm sorry I haven't had time to get to it yet, but I committed CASSSANDRA-5081, and I'm pretty sure some rebase has to happen.

          Show
          Sylvain Lebresne added a comment - As for the attached patch, I'm sorry I haven't had time to get to it yet, but I committed CASSSANDRA-5081, and I'm pretty sure some rebase has to happen.
          Hide
          Sylvain Lebresne added a comment -

          Could the "insert null" command behaves like a batch mutation ?

          I'm not really sure what you mean here. For normal columns (the only ones we should care about in this ticket really), setting to null really mean deleting the column. There is no difficulty in doing that. But for collections, if there is no column internally, there is no element (even for lists, the internal column representing each element does not include the element index; that's why operation like 'setting by index' needs to read the full list, so it can find the element at the requested index by counting them), so we do need to have a column, but his "value" should mean "null".

          Show
          Sylvain Lebresne added a comment - Could the "insert null" command behaves like a batch mutation ? I'm not really sure what you mean here. For normal columns (the only ones we should care about in this ticket really), setting to null really mean deleting the column. There is no difficulty in doing that. But for collections, if there is no column internally, there is no element (even for lists, the internal column representing each element does not include the element index; that's why operation like 'setting by index' needs to read the full list, so it can find the element at the requested index by counting them), so we do need to have a column, but his "value" should mean "null".
          Hide
          Pierre Chalamet added a comment -

          Could the "insert null" command behaves like a batch mutation ? (Speaking without knowing the code base anyway - sorry if this is totally dumb question). I mean, it just need to coordinate (à la batch_mutate) insert (non null values) and deletion (null values).

          Could it be possible to split "insert with null" into 2 sub categories (insertion and deletion) and just delegate to the internal batch_mutate handler ? This way, we do not care to have an internal representation for null values. We just need the atomic mutation for insert/delete on same row, but this was done before.

          Show
          Pierre Chalamet added a comment - Could the "insert null" command behaves like a batch mutation ? (Speaking without knowing the code base anyway - sorry if this is totally dumb question). I mean, it just need to coordinate (à la batch_mutate) insert (non null values) and deletion (null values). Could it be possible to split "insert with null" into 2 sub categories (insertion and deletion) and just delegate to the internal batch_mutate handler ? This way, we do not care to have an internal representation for null values. We just need the atomic mutation for insert/delete on same row, but this was done before.
          Hide
          Sylvain Lebresne added a comment -

          It would also be convenient if CQL3 supported null values for collections.

          I'm afraid this is not so simple I'm afraid. Doing that would basically require having an internal representation of null for every type (because in that case, we would be talking of actually storing a null value, not just deleting a column). And we don't have an easy candidate for that internal representation currently. Meaning we would have to change a bit the internal representation of collection columns (say, add some flags somewhere in the column name that say "this is a null element"). But I'm not convinced we can do that without breaking backward compatibility really. In any case, this is a very special case so we should handle that in a followup ticket if we do handle. But again, not sure how doable that is in practice.

          Say I have
          ALTER TABLE metrics ADD valueBySecond list<double>;

          Just for the record, this sound like a bad use of a list. Because valueBySecond sound like something that by design will grow fairly big. Remember that currently you cannot fetch parts of a collection. And even when we allow fetching part of a collection, we'll only be able to do it for maps and sets, not for lists. This is typically a case where you'd rather want to have a composite PK.

          Show
          Sylvain Lebresne added a comment - It would also be convenient if CQL3 supported null values for collections. I'm afraid this is not so simple I'm afraid. Doing that would basically require having an internal representation of null for every type (because in that case, we would be talking of actually storing a null value, not just deleting a column). And we don't have an easy candidate for that internal representation currently. Meaning we would have to change a bit the internal representation of collection columns (say, add some flags somewhere in the column name that say "this is a null element"). But I'm not convinced we can do that without breaking backward compatibility really. In any case, this is a very special case so we should handle that in a followup ticket if we do handle. But again, not sure how doable that is in practice. Say I have ALTER TABLE metrics ADD valueBySecond list<double>; Just for the record, this sound like a bad use of a list. Because valueBySecond sound like something that by design will grow fairly big. Remember that currently you cannot fetch parts of a collection. And even when we allow fetching part of a collection, we'll only be able to do it for maps and sets, not for lists. This is typically a case where you'd rather want to have a composite PK.
          Hide
          Joachim Haagen Skeie added a comment -

          It would also be convenient if CQL3 supported null values for collections.

          Say I have
          ALTER TABLE metrics ADD valueBySecond list<double>;

          And I want to insert metric values into the table, some data-point could be null if no measurement for that time period exists. It would be great if CQL3 supported doing:

          INSERT into metrics (id, valueBySecond ) values ( 'id', [1.2, 3,4, null, 4.5, null]);

          Show
          Joachim Haagen Skeie added a comment - It would also be convenient if CQL3 supported null values for collections. Say I have ALTER TABLE metrics ADD valueBySecond list<double>; And I want to insert metric values into the table, some data-point could be null if no measurement for that time period exists. It would be great if CQL3 supported doing: INSERT into metrics (id, valueBySecond ) values ( 'id', [1.2, 3,4, null, 4.5, null] );
          Sylvain Lebresne made changes -
          Fix Version/s 1.2.3 [ 12324089 ]
          Fix Version/s 1.2.2 [ 12323924 ]
          Michał Michalski made changes -
          Assignee Michał Michalski [ michalm ]
          Michał Michalski made changes -
          Attachment 3783-v2.patch [ 12570097 ]
          Hide
          Michał Michalski added a comment - - edited

          OK, here's the patch. In general I've added possibility to check TYPE of classes implementing Term.Raw (more precise: checking if they're NULLs) and use it in Operations classes implementing RawUpdate (in one of these classes, actually). If Term Type is NULL, return Constants.Deleter Operation when preparing operation.

          Quick explaination:

          Why, among all the classes containing nested Literal classes implementing Term.Raw, only one checks for Term Type and rest of them return false by default?

          • Constants: it checks the Type because, obviously, this is the main null-value use case
          • Sets, Lists, Maps: If one of them is set to null it's matched (by parser) as a null-based Constant (which seems to be fine from my point of view, doesn't it?), so none of them can be of type null itself
          • TypeCast, FunctionCall, AbstractMarker: it just doesn't make sense for them to be null

          Why only SetValue (among other RawUpdate interfaces) checks if Term is NULL and returns Deleter if yes:

          • RawUpdate:
            SetValue: obviously, this was the main use case here
            SetElement: setting collections' values to null might make sense, but do we want it?
            Addition, Substraction, Prepend: adding null to existing column's contents etc. doesn't make sense for me
          • RawDeletion: If we do not support nulls in SELECTs because of the indexing reasons, I guess it can't be done in DELETEs too

          Additionally I renamed NULL to K_NULL, as suggested.

          Show
          Michał Michalski added a comment - - edited OK, here's the patch. In general I've added possibility to check TYPE of classes implementing Term.Raw (more precise: checking if they're NULLs) and use it in Operations classes implementing RawUpdate (in one of these classes, actually). If Term Type is NULL, return Constants.Deleter Operation when preparing operation. Quick explaination: Why, among all the classes containing nested Literal classes implementing Term.Raw, only one checks for Term Type and rest of them return false by default? Constants: it checks the Type because, obviously, this is the main null-value use case Sets, Lists, Maps: If one of them is set to null it's matched (by parser) as a null-based Constant (which seems to be fine from my point of view, doesn't it?), so none of them can be of type null itself TypeCast, FunctionCall, AbstractMarker: it just doesn't make sense for them to be null Why only SetValue (among other RawUpdate interfaces) checks if Term is NULL and returns Deleter if yes: RawUpdate: SetValue: obviously, this was the main use case here SetElement: setting collections' values to null might make sense, but do we want it? Addition, Substraction, Prepend: adding null to existing column's contents etc. doesn't make sense for me RawDeletion: If we do not support nulls in SELECTs because of the indexing reasons, I guess it can't be done in DELETEs too Additionally I renamed NULL to K_NULL, as suggested.
          Hide
          Michał Michalski added a comment -

          Thanks for explainations, Sylvain. I think I understand the general idea, but I'm not sure if I understood the implementation details you mentioned in 100% as I'm still "discovering" C* codebase, so I'll try to push this task forward a bit and post an updated patch in a day or two, asking for your quick review

          Show
          Michał Michalski added a comment - Thanks for explainations, Sylvain. I think I understand the general idea, but I'm not sure if I understood the implementation details you mentioned in 100% as I'm still "discovering" C* codebase, so I'll try to push this task forward a bit and post an updated patch in a day or two, asking for your quick review
          Hide
          Sylvain Lebresne added a comment -

          Will this be supported for prepared statement as well ?

          Yes, there is even CASSANDRA-5081 opened for that specific part.

          Show
          Sylvain Lebresne added a comment - Will this be supported for prepared statement as well ? Yes, there is even CASSANDRA-5081 opened for that specific part.
          Hide
          Pierre Chalamet added a comment -

          Will this be supported for prepared statement as well ?

          Show
          Pierre Chalamet added a comment - Will this be supported for prepared statement as well ?
          Hide
          Sylvain Lebresne added a comment -

          Right, sorry.

          Show
          Sylvain Lebresne added a comment - Right, sorry.
          Hide
          Jonathan Ellis added a comment -

          The intent, as far as inserts/updates are concerned is that setting the column to null is equivalent to not setting it at all

          More precisely, it's equivalent to deleting it.

          Show
          Jonathan Ellis added a comment - The intent, as far as inserts/updates are concerned is that setting the column to null is equivalent to not setting it at all More precisely, it's equivalent to deleting it.
          Hide
          Sylvain Lebresne added a comment -

          As far as the parser go this is fine (except that since null is a keyword, I'd rather call the token K_NULL for consistency sake).

          But otherwise I don't think treating null as the empty string is a good idea. As you say it doesn't work for strings, but even for other types, inserting an empty byte buffer is not the intent (and in fact this would probably break some type of querying since the empty byte buffer is kind of reserved and special cased in query slices). The intent, as far as inserts/updates are concerned is that setting the column to null is equivalent to not setting it at all, it's just a convenience. Which means that we probably need special handling in the Operation classes (we'll need that special treatment for handling null in prepared statement anyway (CASSANDRA-5081, which is probably the most useful part of supporting nulls)).

          For selects however, null is actually a constraint on the query. For that, we may want/need to update the 2ndary index code to handle that. But maybe that part could be done separatly in fact.

          I do want to note that the example in the description of this ticket is kind of bad. In theory, a PRIMARY KEY cannot contain a null value by definition and currently we don't allow them in general. The example in the description is one exception (compact + composite) where we could indeed allow some nulls in the PK, but since it is a special case, maybe we can leave that to a follow up ticket.

          Show
          Sylvain Lebresne added a comment - As far as the parser go this is fine (except that since null is a keyword, I'd rather call the token K_NULL for consistency sake). But otherwise I don't think treating null as the empty string is a good idea. As you say it doesn't work for strings, but even for other types, inserting an empty byte buffer is not the intent (and in fact this would probably break some type of querying since the empty byte buffer is kind of reserved and special cased in query slices). The intent, as far as inserts/updates are concerned is that setting the column to null is equivalent to not setting it at all, it's just a convenience. Which means that we probably need special handling in the Operation classes (we'll need that special treatment for handling null in prepared statement anyway ( CASSANDRA-5081 , which is probably the most useful part of supporting nulls)). For selects however, null is actually a constraint on the query. For that, we may want/need to update the 2ndary index code to handle that. But maybe that part could be done separatly in fact. I do want to note that the example in the description of this ticket is kind of bad. In theory, a PRIMARY KEY cannot contain a null value by definition and currently we don't allow them in general. The example in the description is one exception (compact + composite) where we could indeed allow some nulls in the PK, but since it is a special case, maybe we can leave that to a follow up ticket.
          Michał Michalski made changes -
          Attachment 3783-wip-v1.patch [ 12568760 ]
          Hide
          Michał Michalski added a comment -

          I started to investigate this problem a bit and I easily came to the point where I can pass null in cql3sh - see attached work-in-progress patch. It even ends up with proper INSERT ("proper" means that it shows up as "red colored" null in cqlsh when performing SELECT ) for all fields of Type that returns EMPTY_BYTE_BUFFER in fromString method, as I pass "" to Literal constructor. However, this simply cannot work with strings, as "" will be saved as "". Anyway, before I try to push this problem further, I have some questions:

          1. Is this (see attachment), in general, a good way to go with this task? Or maybe I'm completely wrong with this approach?

          2. How about "string-based" Types? We obviously can't return EMPTY_BYTE_BUFFER for "" because it's a valid string, so I was thinking about a solution that will handle literal 'null' value as a second param for Literal class constructor, which could be then handled (by simple 'if') in all db.marshall types to return EMPTY_BYTE_BUFFER. It doesn't look like a "perfect" solution to me, but it looks quite clean and... well... it's the best idea I had so far

          Any other thoughts on this?

          Show
          Michał Michalski added a comment - I started to investigate this problem a bit and I easily came to the point where I can pass null in cql3sh - see attached work-in-progress patch. It even ends up with proper INSERT ("proper" means that it shows up as "red colored" null in cqlsh when performing SELECT ) for all fields of Type that returns EMPTY_BYTE_BUFFER in fromString method, as I pass "" to Literal constructor. However, this simply cannot work with strings, as "" will be saved as "". Anyway, before I try to push this problem further, I have some questions: 1. Is this (see attachment), in general, a good way to go with this task? Or maybe I'm completely wrong with this approach? 2. How about "string-based" Types? We obviously can't return EMPTY_BYTE_BUFFER for "" because it's a valid string, so I was thinking about a solution that will handle literal 'null' value as a second param for Literal class constructor, which could be then handled (by simple 'if') in all db.marshall types to return EMPTY_BYTE_BUFFER. It doesn't look like a "perfect" solution to me, but it looks quite clean and... well... it's the best idea I had so far Any other thoughts on this?
          Gavin made changes -
          Workflow patch-available, re-open possible [ 12753109 ] reopen-resolved, no closed status, patch-avail, testing [ 12755761 ]
          Gavin made changes -
          Workflow no-reopen-closed, patch-avail [ 12650521 ] patch-available, re-open possible [ 12753109 ]
          Jonathan Ellis made changes -
          Fix Version/s 1.2.2 [ 12323924 ]
          Fix Version/s 1.2.1 [ 12322953 ]
          Jonathan Ellis made changes -
          Fix Version/s 1.2.1 [ 12322953 ]
          Fix Version/s 1.2.0 [ 12319262 ]
          Jonathan Ellis made changes -
          Fix Version/s 1.2 [ 12319262 ]
          Fix Version/s 1.1.3 [ 12321881 ]
          Sylvain Lebresne made changes -
          Fix Version/s 1.1.3 [ 12321881 ]
          Fix Version/s 1.1.2 [ 12321445 ]
          Jonathan Ellis made changes -
          Fix Version/s 1.1.2 [ 12321445 ]
          Fix Version/s 1.1.1 [ 12319857 ]
          Sylvain Lebresne made changes -
          Fix Version/s 1.1.1 [ 12319857 ]
          Fix Version/s 1.1 [ 12317615 ]
          Sylvain Lebresne made changes -
          Component/s API [ 12313742 ]
          Sylvain Lebresne made changes -
          Field Original Value New Value
          Fix Version/s 1.1 [ 12317615 ]
          Sylvain Lebresne created issue -

            People

            • Assignee:
              Michał Michalski
              Reporter:
              Sylvain Lebresne
              Reviewer:
              Aleksey Yeschenko
            • Votes:
              4 Vote for this issue
              Watchers:
              18 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development