  1. Apache Cassandra
  2. CASSANDRA-15811

Improve DROP COMPACT STORAGE

Details

    • Bug
    • Status: Resolved
    • Normal
    • Resolution: Later
    • 3.0.24, 3.11.10
    • Cluster/Schema
    • None
    • Correctness - Transient Incorrect Response
    • Normal
    • Normal
    • Adhoc Test
    • All
    • None

    Description

      DROP COMPACT STORAGE was introduced in CASSANDRA-10857 as one of the steps towards deprecating Thrift. However, the current semantics of dropping the compact storage flag from a table reveal several columns that are usually empty (column1 and value in the non-dense case, value in the dense case, and a column with an empty name for super column families). Showing these columns can confuse application developers, especially those who have never used Thrift and/or made writes that assumed the presence of those fields, and who used compact storage in 3.x only because it has “compact” in the name.
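
      As a rough illustration of which columns become visible in each case, the mapping described above can be written down as a small lookup. This is a toy model only: the names REVEALED_COLUMNS and columns_after_drop are hypothetical and do not correspond to anything in the Cassandra code base.

```python
# Toy model only: hypothetical names, not Cassandra's API.
# Columns that become visible after DROP COMPACT STORAGE, per the description.
REVEALED_COLUMNS = {
    "non_dense": ["column1", "value"],
    "dense": ["value"],
    "super": [""],  # a column with an empty name
}


def columns_after_drop(kind, declared_columns):
    """Return the column names a client would see once the flag is dropped."""
    return list(declared_columns) + REVEALED_COLUMNS[kind]
```

      For example, columns_after_drop("non_dense", ["id", "name"]) lists column1 and value after the two declared columns.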

      There’s not much we can do in the super column family case, especially considering there’s no way to create a super column family using CQL, but we can improve the dense and non-dense cases. We can scan sstables and make sure there are no signs of Thrift writes in them, and if all sstables conform to this rule, we can not only drop the flag, but also drop the columns that are supposed to be hidden. However, this is not very user-friendly and is probably not worth the development effort.
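
      The scanning idea can be sketched as follows. The sketch assumes that a "sign of a Thrift write" is any non-empty value in a hidden column, and it models an sstable as a plain list of rows; both are simplifications made for illustration, and the function names are hypothetical.

```python
# Toy model: an sstable is a list of rows, each row a dict of
# column name -> bytes. This is not Cassandra's sstable format or API.

def has_thrift_writes(sstable, hidden_columns):
    """True if any row carries a non-empty value in a hidden column."""
    return any(row.get(col) for row in sstable for col in hidden_columns)


def can_drop_hidden_columns(sstables, hidden_columns):
    """Hidden columns may be dropped only if every sstable is clean."""
    return not any(has_thrift_writes(s, hidden_columns) for s in sstables)
```

      A single row with data in a hidden column is enough to block the drop, which is why every sstable has to be scanned.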

      An alternative to scanning is to add FORCE DROP COMPACT syntax (or something similar) that would just drop the columns unconditionally. It is likely that people who were using compact storage with Thrift know they were doing that, so they'll usually use the "regular" DROP COMPACT, without force, which will simply reveal the columns as it does right now.
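
      The difference between the two variants can be sketched like this. The function drop_compact_storage and its force parameter are hypothetical stand-ins for the proposed syntax, and the table is reduced to a list of column names.

```python
def drop_compact_storage(columns, hidden_columns, force=False):
    """Return the visible columns after dropping the compact storage flag.

    force=False models the current behaviour: hidden columns are revealed.
    force=True models the proposed FORCE variant: hidden columns are
    dropped unconditionally, without inspecting any data.
    """
    if force:
        return [c for c in columns if c not in hidden_columns]
    return list(columns)
```

      The forced variant never looks at the data, so the choice between the two is left entirely to the operator.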

      To fix CASSANDRA-15778, and to allow an EmptyType column to actually hold data[*], we had to remove EmptyType validation, so handling compact storage properly starts making more sense. We’ll solve it either by not having the columns at all, and hence not caring about their values, or by keeping the columns and their data, which requires no validation in this case. The EmptyType field will have to be handled differently, though.

      [*] It is possible to end up with sstables upgraded from 2.x or written in 3.x before CASSANDRA-15373, which means not every 2.x-upgraded or 3.x cluster is guaranteed to have empty values in this column, and this behaviour, even if undesired, might be relied on by people.

      An open question: CASSANDRA-15373 adds validation to EmptyType that disallows writing any non-empty value to it, but we already allow creating a table via CQL and then writing data into it with Thrift. This seems to have been unintended, but it might have become a feature people rely on. If we simply backport 15373 to 2.2 and 2.1, we’ll change and break that behaviour. Given that no one complained in 3.0 and 3.11, it is unlikely that anyone relies on it, though.
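
      The kind of check being discussed can be sketched in a few lines. This is not the code from CASSANDRA-15373; validate_empty_type is a hypothetical function that only illustrates the rule "reject any non-empty value".

```python
def validate_empty_type(value: bytes) -> None:
    """Reject any non-empty value (stand-in for the check described above)."""
    if len(value) > 0:
        raise ValueError(
            f"EmptyType only accepts empty values, got {len(value)} bytes"
        )
```

      With a check like this in the write path, a Thrift write that puts data into such a column would start failing, which is the behaviour change the question is about.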

            People

              marcuse Marcus Eriksson
              ifesdjeen Alex Petrov
              Marcus Eriksson