Cassandra
  1. Cassandra
  2. CASSANDRA-5428

CQL3 don't validate that collections haven't more than 64K elements

    Details

    • Type: Bug Bug
    • Status: Resolved
    • Priority: Minor Minor
    • Resolution: Fixed
    • Fix Version/s: 1.2.13
    • Component/s: None
    • Labels:
      None

      Description

      This is somewhat similar to CASSANDRA-5355 but with a twist. When we serialize collections, not only does the size of the elements is limited to 64K, but the number of elements is too because it is also an unsigned short.

      Now the same argument than in CASSANDRA-5355 that collections are "places to denormalize small amounts of data" is true here too. So the fact that collections are limited to 64K elements is something I could live with. However, we don't validate that no more than 64K elements are inserted. And in fact, we can't validate it if the elements are added one by one.

      So in practice, you can insert more than 64K elements, but if you try to read it, you will only get back some subset of the collection. And the number of elements returned will correspond to the 2 last bytes of the real size (so a collection of 65536 elements will be returned as 1 element). Imo, that's more problematic.

      So since unfortunately we can't validate this at insertion, I suggest that as a first step we:

      1. document that limitation (in http://cassandra.apache.org/doc/cql3/CQL.html typically)
      2. when we read a collection that has > 64K elements, we detect it and when serializing that for the client, we:
        • return as much as we can, i.e. the 64K first ones
        • log a warning that something is wrong

      On the longer term, for 2.0, maybe we should just change the serialization format and use an int for the collection size, using an unsigned short was probably misguided. Of course that changes said serialization format so we have to bump the native protocol version for that (and thus can't do that in 1.2).

      1. 5428.txt
        4 kB
        Sylvain Lebresne

        Issue Links

          Activity

          Hide
          Jonathan Ellis added a comment -

          Sounds like a reasonable plan.

          Show
          Jonathan Ellis added a comment - Sounds like a reasonable plan.
          Hide
          Sylvain Lebresne added a comment -

          Attaching patch that does the part 2. above, detecting when we're about to serialize a collection with > 64K elements, logging an error and sending only the first 64K elements. On the documentation part, I've already ninja-committed a new paragraph to the doc for the limitations (no pushed online yet though).

          As already said above, this is obviously not perfect and next version of the native protocol we do, we should probably switch to an int to encode collection sizes.

          Show
          Sylvain Lebresne added a comment - Attaching patch that does the part 2. above, detecting when we're about to serialize a collection with > 64K elements, logging an error and sending only the first 64K elements. On the documentation part, I've already ninja-committed a new paragraph to the doc for the limitations (no pushed online yet though). As already said above, this is obviously not perfect and next version of the native protocol we do, we should probably switch to an int to encode collection sizes.
          Hide
          Sylvain Lebresne added a comment -

          First version of the patch was incorrect, attaching corrected version.

          Show
          Sylvain Lebresne added a comment - First version of the patch was incorrect, attaching corrected version.
          Hide
          Aleksey Yeschenko added a comment -

          +1

          Show
          Aleksey Yeschenko added a comment - +1
          Hide
          Sylvain Lebresne added a comment -

          Committed, thanks

          Show
          Sylvain Lebresne added a comment - Committed, thanks

            People

            • Assignee:
              Sylvain Lebresne
              Reporter:
              Sylvain Lebresne
              Reviewer:
              Aleksey Yeschenko
            • Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development