Apache Cassandra / CASSANDRA-11721

Have a per operation truncate ddl "no snapshot" option

Details

    Description

      Right now, TRUNCATE always creates a snapshot. That is the right thing to do most of the time. The 'auto_snapshot' option can disable that, but it is server-wide and requires a restart to change. There are data models, however, that require rotating through a handful of tables and periodically truncating them. Currently you either have to operate with no safety net (some actually do this) or manually clear those snapshots out periodically. Both are less than optimal.

      In HDFS, deleting something generally moves it to the trash. If you don't want that safety net, you can skip it in one command with something like 'hdfs dfs -rm -r -skipTrash /jeremy/stuff'.

      It would be nice to have something in the TRUNCATE DDL to skip the snapshot on a per-operation basis. Perhaps 'TRUNCATE solarsystem.earth NO SNAPSHOT'.

      This might also be useful in those situations where you're just playing with data and you don't want something to take a snapshot in a development system. If that's the case, this would also be useful for the DROP operation, but that convenience is not the main reason for this option.

      Additional information for newcomers:

      This ticket is a bit more complex than normal LHF (low-hanging fruit) tickets but is still reasonably easy.

      The idea is to support disabling snapshots when performing a TRUNCATE, as follows:

      TRUNCATE x WITH OPTIONS = { 'snapshot' : false }
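
      To make the intended semantics of the option concrete, here is a minimal, self-contained sketch of how the 'snapshot' option might be resolved. It is not Cassandra code: TruncateOptionsSketch and shouldSnapshot are hypothetical names, and the sketch assumes the parser hands the options over as a map of string keys to string values. The one design point it illustrates is that the option should default to true, so a plain TRUNCATE behaves as it does today.

```java
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for the proposed TruncateAttributes; not Cassandra code. */
final class TruncateOptionsSketch {
    static final String SNAPSHOT = "snapshot";
    private static final Set<String> KNOWN_KEYS = Set.of(SNAPSHOT);

    /**
     * Returns whether a snapshot should be taken for this TRUNCATE.
     * Defaults to true so that omitting the option keeps the current behavior.
     */
    static boolean shouldSnapshot(Map<String, String> options) {
        // Reject unknown keys early so typos such as 'snapshots' are not silently ignored.
        for (String key : options.keySet()) {
            if (!KNOWN_KEYS.contains(key)) {
                throw new IllegalArgumentException("Unknown TRUNCATE option: " + key);
            }
        }
        String raw = options.get(SNAPSHOT);
        if (raw == null) {
            return true; // option not specified: keep the safety net
        }
        if (raw.equalsIgnoreCase("true")) {
            return true;
        }
        if (raw.equalsIgnoreCase("false")) {
            return false;
        }
        throw new IllegalArgumentException("Invalid value for 'snapshot': " + raw);
    }
}
```

      With this sketch, an empty options map yields true, and a map containing 'snapshot' set to 'false' yields false; any other key or value is rejected.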

      In order to implement that feature several changes are required:

      • A new class TruncateAttributes inheriting from PropertyDefinitions must be created, in a similar way to KeyspaceAttributes or TableAttributes.
      • This class should be passed to the TruncateStatement constructor and stored as a field.
      • The ANTLR parser logic should be changed to retrieve the options and pass them to the constructor (see createKeyspaceStatement for an example).
      • The TruncateStatement will then need to be modified to take the new option into account. Locally, it will need to call ColumnFamilyStore#truncateBlockingWithoutSnapshot instead of ColumnFamilyStore#truncateBlocking if no snapshot should be taken. For non-local calls, it will need to pass a new parameter to StorageProxy#truncateBlocking. That parameter will then need to be passed to the other nodes through the TruncateRequest.
      • Because a new field needs to be added to TruncateRequest, this field will need to be serialized and deserialized, and a new MessagingService.Version will need to be created and set as the current version. The new version should be 50 (and yes, it means that the next release will be a major one, 5.0).
      • In TruncateVerbHandler, the new field should be used to determine whether ColumnFamilyStore#truncateBlockingWithoutSnapshot or ColumnFamilyStore#truncateBlocking should be called.
      • An in-jvm test should be added in test/distributed/org/apache/cassandra/distributed/test to verify that truncate does not generate snapshots when the new option is specified.

      Do not hesitate to ping the mentor for more information.
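
      The serialization step in the list above is the reason a new messaging version is needed: the extra field may only be written to, and read from, peers that understand it. The following is a minimal sketch of that version gating using plain java.io streams. It is not the real TruncateRequest serializer: TruncateRequestSketch, VERSION_OLD and VERSION_WITH_SNAPSHOT_FLAG are hypothetical names, and the version numbers are placeholders rather than actual MessagingService values.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical stand-in for TruncateRequest and its serializer; not Cassandra code. */
final class TruncateRequestSketch {
    // Placeholder version numbers, assumed for illustration only.
    static final int VERSION_OLD = 1;
    static final int VERSION_WITH_SNAPSHOT_FLAG = 2;

    final String keyspace;
    final String table;
    final boolean snapshot;

    TruncateRequestSketch(String keyspace, String table, boolean snapshot) {
        this.keyspace = keyspace;
        this.table = table;
        this.snapshot = snapshot;
    }

    /** Writes the request; the new flag is only written for versions that know about it. */
    void serialize(DataOutputStream out, int version) throws IOException {
        out.writeUTF(keyspace);
        out.writeUTF(table);
        if (version >= VERSION_WITH_SNAPSHOT_FLAG) {
            out.writeBoolean(snapshot);
        }
    }

    /** Reads the request; older versions carry no flag, so the safe default (snapshot) applies. */
    static TruncateRequestSketch deserialize(DataInputStream in, int version) throws IOException {
        String keyspace = in.readUTF();
        String table = in.readUTF();
        boolean snapshot = version >= VERSION_WITH_SNAPSHOT_FLAG ? in.readBoolean() : true;
        return new TruncateRequestSketch(keyspace, table, snapshot);
    }

    /** Serializes and immediately deserializes a request at the given version. */
    static TruncateRequestSketch roundTrip(TruncateRequestSketch request, int version) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            request.serialize(new DataOutputStream(bytes), version);
            return deserialize(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())), version);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

      In this sketch, a request with snapshot set to false keeps that value when round-tripped at the new version, but comes back with snapshot set to true at the old version, because the flag is never written. In a mixed-version cluster that would mean older nodes still take snapshots, which is something the real implementation would have to decide how to handle.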

            People

              Assignee: Unassigned
              Reporter: Jeremy Hanna (jeromatron)
              Votes: 1
              Watchers: 7
