Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

      These patches implement removing columns, supercolumns, or columnfamilies for a given key.

      1. remove.zip
        53 kB
        Jonathan Ellis

        Activity

        Jonathan Ellis added a comment -

        This patch makes changes that make remove support easier or possible:

        Column:

        • add boolean isMarkedForDelete. If true, the timestamp field represents the deletion time
        • all fields are final (immutable). This avoids the need for Atomic* variables and makes whole classes of bugs impossible
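A minimal sketch of the immutable-column idea described above. The class name `ColumnSketch` and the `markDeleted` helper are hypothetical and are not taken from the patch; only the `isMarkedForDelete` flag and the "timestamp is the deletion time" rule come from the description.

```java
// Hypothetical illustration; not the actual Column class from the patch.
final class ColumnSketch {
    private final String name;
    private final byte[] value;
    private final long timestamp;            // write time, or deletion time if deleted
    private final boolean isMarkedForDelete;

    ColumnSketch(String name, byte[] value, long timestamp, boolean isMarkedForDelete) {
        this.name = name;
        this.value = value.clone();          // defensive copy keeps the instance immutable
        this.timestamp = timestamp;
        this.isMarkedForDelete = isMarkedForDelete;
    }

    String name() { return name; }
    byte[] value() { return value.clone(); }
    long timestamp() { return timestamp; }
    boolean isMarkedForDelete() { return isMarkedForDelete; }

    // Deleting never mutates: it returns a new column whose timestamp
    // records when the deletion happened.
    ColumnSketch markDeleted(long deletionTime) {
        return new ColumnSketch(name, new byte[0], deletionTime, true);
    }
}
```

Because every field is final, an instance can be shared between threads without Atomic* wrappers or locking.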

        SuperColumn:

        • removed boolean isMarkedForDelete
        • long markedForDeleteAt added. If greater than MIN_VALUE, it is considered deleted at the given time
        • putColumn() and repair() combined; renamed to integrate()
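The `markedForDeleteAt` sentinel can be sketched as follows. `SuperColumnSketch`, `addSubcolumn`, `markForDeleteAt`, and `timestampOf` are hypothetical names, and the "newest timestamp wins" merge rule inside `integrate()` is an assumption about how the combined method behaves, not something the description states.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; not the actual SuperColumn class from the patch.
final class SuperColumnSketch {
    // Long.MIN_VALUE means "never deleted"; anything greater is a deletion time.
    private long markedForDeleteAt = Long.MIN_VALUE;
    // Subcolumn name -> timestamp of the newest write seen (values omitted for brevity).
    private final Map<String, Long> subcolumnTimestamps = new HashMap<>();

    boolean isMarkedForDelete() { return markedForDeleteAt > Long.MIN_VALUE; }
    long getMarkedForDeleteAt() { return markedForDeleteAt; }

    void markForDeleteAt(long timestamp) {
        markedForDeleteAt = Math.max(markedForDeleteAt, timestamp);
    }

    void addSubcolumn(String name, long timestamp) {
        subcolumnTimestamps.merge(name, timestamp, Long::max);
    }

    Long timestampOf(String name) { return subcolumnTimestamps.get(name); }

    // Assumed merge rule: the newest timestamp wins, both for each
    // subcolumn and for the deletion marker.
    void integrate(SuperColumnSketch other) {
        markForDeleteAt(other.markedForDeleteAt);
        other.subcolumnTimestamps.forEach(this::addSubcolumn);
    }
}
```

Storing a time rather than a flag lets a later reader compare the deletion against each subcolumn's own timestamp.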

        ColumnFamily:

        • long markedForDeleteAt added, as in SuperColumn
        • isSuper() convenience method added
        • addColumn and createColumn methods combined; all are now overloads of addColumn. Note that addColumn(name, column) was removed in favor of simply addColumn(column) since the column already knows its name, and allowing a different one to be specified could result in hard-to-find bugs
        • serializer always dumps + loads the Columns; trying to optimize by leaving them out causes bugs with remove
        • renamed serializer2 to serializerWithIndexes
        • renamed getColumnFamilies to getColumnFamilyMap. Added getColumnFamilies method returning only the CF collection (the map values).
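The `addColumn` consolidation could look roughly like this. `NamedColumn`, `ColumnFamilySketch`, `getColumn`, and `getColumnCount` are hypothetical names used for illustration; the point is only that the map key is always taken from the column itself, so the two can never disagree.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration; not the actual ColumnFamily class from the patch.
final class NamedColumn {
    final String name;
    final byte[] value;
    final long timestamp;

    NamedColumn(String name, byte[] value, long timestamp) {
        this.name = name;
        this.value = value;
        this.timestamp = timestamp;
    }
}

final class ColumnFamilySketch {
    private final Map<String, NamedColumn> columns = new TreeMap<>();

    // The column supplies its own key; there is no addColumn(name, column)
    // overload that could store a column under a mismatched name.
    void addColumn(NamedColumn column) {
        columns.put(column.name, column);
    }

    // Convenience overload (formerly a separate "create" method in this sketch).
    void addColumn(String name, byte[] value, long timestamp) {
        addColumn(new NamedColumn(name, value, timestamp));
    }

    NamedColumn getColumn(String name) { return columns.get(name); }
    int getColumnCount() { return columns.size(); }
}
```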

        Memtable:

        • added SuperColumn support to forceFlush. Refactored flush methods slightly so that the only one who cares about fRecovery is Table. [everyone else just passed False.]

        NamesFilter:

        • makes a copy of the List it is passed. This fixes a bug that may not be specific to remove support.
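A defensive copy of this kind might look like the sketch below. The comment does not say why sharing the list caused a bug; the sketch assumes the filter removes names from its working list as it matches them, which is a guess. `NamesFilterSketch`, `matchAndConsume`, and `isDone` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not the actual NamesFilter class from the patch.
final class NamesFilterSketch {
    private final List<String> names;

    NamesFilterSketch(List<String> names) {
        // Copy so that later mutation here never changes the caller's list.
        this.names = new ArrayList<>(names);
    }

    // Assumed behavior: a name is dropped from the working list once matched.
    boolean matchAndConsume(String columnName) {
        return names.remove(columnName);
    }

    boolean isDone() { return names.isEmpty(); }
}
```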

        Row:

        • merge() removed (duplicate of Repair)

        RowMutation:

        • added makeRowMutationMessage()
        • added sanity checks to add()
        • added delete(columnFamilyColumn, timestamp) method
        • cleaned up duplicate code in apply() overloads

        Message:

        • Changed constructor from Object[] body to Object... body. This allows (but does not require) single Objects to be passed without explicitly wrapping in a new Object[] {}.
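The effect of switching to varargs can be seen in a small sketch. `MessageSketch` and `getBody` are hypothetical stand-ins; the real Message constructor takes other parameters that are omitted here.

```java
// Hypothetical illustration; not the actual Message class from the patch.
final class MessageSketch {
    private final Object[] body;

    // Object... accepts an explicit Object[] as well as loose arguments.
    MessageSketch(Object... body) {
        this.body = body;
    }

    Object[] getBody() { return body; }
}
```

Both `new MessageSketch("payload")` and `new MessageSketch(new Object[] {"payload"})` compile and produce a one-element body, so existing callers that build the array explicitly keep working.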

        General:

        • old-style remove/delete support removed, since it's going to be rewritten in the next patch

        The other changes are just dealing with the consequences of the above, particularly the getColumnFamilyMap rename and the CF.addColumn parameter change.

        Jonathan Ellis added a comment -

        This patch provides the actual remove support internally. Thrift API support is not yet included.

        ColumnComparatorFactory:

        • fix exception when comparing two SuperColumns with the same name

        ColumnFamilyStore:

        • Split resolve() into resolve(), which combines ColumnFamilies, and removeDeleted(), which takes a single ColumnFamily and returns a new one with deleted IColumns removed. Keep deletion information around until removeDeleted is called so that deletion information can properly suppress older IColumns.
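Why the deletion information has to survive until after resolve() can be shown with a simplified model. `Cell` and `ResolveSketch` are hypothetical, and a plain map stands in for a ColumnFamily; the merge rule (highest timestamp wins) is an assumption made for the sketch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; a Map<String, Cell> stands in for a ColumnFamily.
final class Cell {
    final String value;
    final long timestamp;
    final boolean deleted;   // true means this cell is a deletion marker

    Cell(String value, long timestamp, boolean deleted) {
        this.value = value;
        this.timestamp = timestamp;
        this.deleted = deleted;
    }
}

final class ResolveSketch {
    // Combine two versions; for each name the cell with the higher timestamp wins.
    static Map<String, Cell> resolve(Map<String, Cell> a, Map<String, Cell> b) {
        Map<String, Cell> merged = new HashMap<>(a);
        for (Map.Entry<String, Cell> e : b.entrySet()) {
            merged.merge(e.getKey(), e.getValue(),
                    (x, y) -> y.timestamp > x.timestamp ? y : x);
        }
        return merged;
    }

    // Return a new map without deletion markers; the input is left unchanged.
    static Map<String, Cell> removeDeleted(Map<String, Cell> cf) {
        Map<String, Cell> live = new HashMap<>();
        for (Map.Entry<String, Cell> e : cf.entrySet()) {
            if (!e.getValue().deleted) {
                live.put(e.getKey(), e.getValue());
            }
        }
        return live;
    }
}
```

If one version holds a live cell at timestamp 5 and another holds a deletion marker at timestamp 10, resolving first and then calling `removeDeleted` yields an empty result. Stripping the marker before resolving would let the older live cell reappear.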

        RowMutationVerbHandler:

        • send response back so blocking calls can work

        WriteResponseMessage:

        • Renamed to WriteResponse to avoid confusion with Message class

        StorageProxy:

        • added insertBlocking method for use by batch_insert_blocking, batch_insert_superColumn_blocking, and remove in blocking mode.

        CassandraServer:

        • added remove(String, String, String, long, int). Thrift needs to be modified to expose this and not the old remove (which is left in as a stub to keep the build happy).
        Jonathan Ellis added a comment -

        This fixes the CF deserialization in SequenceFile to know about the format change (boolean -> long).
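The reason a reader must know about the boolean-to-long change is that the two markers differ in width, so every field after the marker shifts. The layout below (deletion marker followed by a column count) is a hypothetical one chosen for illustration, not the actual SequenceFile format; `DeletionMarkerCodec` and its methods are likewise made up.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration; the field layout is assumed, not taken from the patch.
final class DeletionMarkerCodec {
    // New layout: 8-byte markedForDeleteAt, then a 4-byte column count.
    static byte[] serialize(long markedForDeleteAt, int columnCount) {
        return ByteBuffer.allocate(Long.BYTES + Integer.BYTES)
                .putLong(markedForDeleteAt)
                .putInt(columnCount)
                .array();
    }

    // Reader that knows the marker is a long.
    static int readColumnCountNewLayout(byte[] data) {
        ByteBuffer in = ByteBuffer.wrap(data);
        in.getLong();            // skip the 8-byte deletion time
        return in.getInt();
    }

    // Reader that still expects a 1-byte boolean and so reads the count
    // from the wrong offset.
    static int readColumnCountOldLayout(byte[] data) {
        ByteBuffer in = ByteBuffer.wrap(data);
        in.get();                // skip what it believes is a boolean
        return in.getInt();
    }
}
```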

        Jonathan Ellis added a comment -

        I've updated my patches to apply against current trunk and split into bite-sized pieces. Each piece corresponds to one of the steps in the larger patches described above. (Full description is in a Subject: line in the header for each patch.)

        Avinash Lakshman added a comment -

        No. You cannot free up memory. The objects will get garbage collected once they are no longer actively referenced, which will be the case. Setting the reference to null (which is what clear() does) is not going to force any GC anyway. Hence it is moot.

        Jonathan Ellis added a comment -

        Committed in r758965 - r758983

        Hudson added a comment -

        Integrated in Cassandra #571 (See https://hudson.apache.org/hudson/job/Cassandra/571/)
        fix drop race with flush. patch by gdusbabek, reviewed by jbellis. CASSANDRA-1631
        fix drop race with compaction. patch by gdusbabek, reviewed by jbellis. CASSANDRA-1631
        add cli sanity tests. patch by Pavel Yaskevich; reviewed by jbellis for CASSANDRA-1582
        Deprecate RenameColumnFamily and RenameKeyspace. patch by gdusbabek, reviewed by jbellis. CASSANDRA-1630
        disable system_renam-ing in the cli. patch by gdusbabek, reviewed by jbellis. CASSANDRA-1630
        remove system_rename* methods from API. patch by gdusbabek, reviewed by jbellis. CASSANDRA-1630
        remove system_rename* methods from API. thrift/avro changes. patch by gdusbabek, reviewed by jbellis. CASSANDRA-1630


          People

          • Assignee: Jonathan Ellis
          • Reporter: Jonathan Ellis
          • Votes: 0
          • Watchers: 0
