CASSANDRA-16860

Add --older-than option to nodetool clearsnapshot

    Details

    • Change Category:
      Operability
    • Complexity:
      Normal
    • Platform:
      All
    • Impacts:
      None
    • Test and Documentation Plan:
      add tests

      Description

      Summary

      Opening this issue in reference to this WIP PR:

      This functionality lets Cassandra users remove snapshots ad hoc, based on a TTL, to address the problem of snapshots accumulating. For example, an organization I work for aims to keep snapshots for 30 days, but we have no easy way to clean them up once those 30 days are up.

      This is similar to the goals set in https://issues.apache.org/jira/browse/CASSANDRA-16451, but it would be available for Cassandra 3.x.

      Functionality

      This adds a new command to nodetool, called expiresnapshots, with the following options:

      NAME
      nodetool expiresnapshots - Removes snapshots that are older than a TTL
      in days

      SYNOPSIS
      nodetool [(-h <host> | --host <host>)] [(-p <port> | --port <port>)]
      [(-pw <password> | --password <password>)]
      [(-pwf <passwordFilePath> | --password-file <passwordFilePath>)]
      [(-u <username> | --username <username>)] expiresnapshots [--dry-run]
      (-t <ttl> | --ttl <ttl>)

      OPTIONS
      --dry-run
      Run without actually clearing snapshots

      -h <host>, --host <host>
      Node hostname or ip address

      -p <port>, --port <port>
      Remote jmx agent port number

      -pw <password>, --password <password>
      Remote jmx agent password

      -pwf <passwordFilePath>, --password-file <passwordFilePath>
      Path to the JMX password file

      -t <ttl>, --ttl <ttl>
      TTL (in days) to expire snapshots

      -u <username>, --username <username>
      Remote jmx agent username

      The snapshot date is derived from the default snapshot name, which is a timestamp (epoch time in milliseconds). For this reason, snapshots whose names don't contain a timestamp in this format will not be cleared.
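
As a rough illustration of the name-to-date conversion described above, the following Python sketch parses a snapshot name as epoch milliseconds and compares its age against a TTL in days. It is not the code from the PR: the function names `snapshot_time` and `is_expired` are hypothetical, and it assumes that only names consisting entirely of ASCII digits count as timestamps.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def snapshot_time(name: str) -> Optional[datetime]:
    """Return the time encoded in a timestamp-style snapshot name, or None.

    Assumption: a default snapshot name is epoch time in milliseconds,
    written as ASCII digits only.
    """
    if not (name.isascii() and name.isdigit()):
        return None
    try:
        # Integer milliseconds added to the epoch, so no float rounding.
        return EPOCH + timedelta(milliseconds=int(name))
    except OverflowError:
        # The number is too large to be a representable date.
        return None


def is_expired(name: str, ttl_days: int, now: datetime) -> bool:
    """True if the snapshot name carries a timestamp older than ttl_days."""
    taken = snapshot_time(name)
    return taken is not None and now - taken > timedelta(days=ttl_days)
```

Under these assumptions a custom-named snapshot such as `pre-upgrade` yields `None` from `snapshot_time` and is therefore never treated as expired.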

      Example Use

      This Cassandra environment has several snapshots, some recent and some outdated:

      root@cassandra001:/cassandra# nodetool listsnapshots
      Snapshot Details:
      Snapshot name  Keyspace name   Column family name  True size   Size on disk
      1529173922063  users_keyspace  users               362.03 KiB  362.89 KiB
      1629173909461  users_keyspace  users               362.03 KiB  362.89 KiB
      1629173922063  users_keyspace  users               362.03 KiB  362.89 KiB
      1599173922063  users_keyspace  users               362.03 KiB  362.89 KiB
      1629173916816  users_keyspace  users               362.03 KiB  362.89 KiB

      Total TrueDiskSpaceUsed: 1.77 MiB

      To validate that the removal will run as expected, we can use the `--dry-run` option:

      root@cassandra001:/cassandra# nodetool expiresnapshots --ttl 30 --dry-run
      Starting simulated cleanup of snapshots older than 30 days
      Clearing (dry run): 1529173922063
      Clearing (dry run): 1599173922063
      Cleared (dry run): 2 snapshots

      Now that we are confident the correct snapshots will be removed, we can omit the `--dry-run` flag:

      root@cassandra001:/cassandra# nodetool expiresnapshots --ttl 30
      Starting cleanup of snapshots older than 30 days
      Clearing: 1529173922063
      Clearing: 1599173922063
      Cleared: 2 snapshots

      To confirm that the cleanup succeeded, we list the snapshots that remain:

      root@cassandra001:/cassandra# nodetool listsnapshots
      Snapshot Details:
      Snapshot name  Keyspace name   Column family name  True size   Size on disk
      1629173909461  users_keyspace  users               362.03 KiB  362.89 KiB
      1629173922063  users_keyspace  users               362.03 KiB  362.89 KiB
      1629173916816  users_keyspace  users               362.03 KiB  362.89 KiB

      Total TrueDiskSpaceUsed: 1.06 MiB
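
The selection behind the runs above can be sketched in a few lines of Python. This is an illustrative sketch rather than the PR's implementation: `expire_snapshots`, its `delete` callback, and the fixed `now` value (chosen to fall shortly after the newest snapshot in the listing) are all assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def expire_snapshots(names, ttl_days, now, delete, dry_run=False):
    """Hypothetical helper: clear timestamp-named snapshots older than ttl_days.

    `delete` is called once per expired name unless dry_run is set.
    Returns the list of names selected for clearing.
    """
    cutoff = now - timedelta(days=ttl_days)
    suffix = " (dry run)" if dry_run else ""
    cleared = []
    for name in names:
        # Names that are not plain epoch-millisecond timestamps are skipped.
        if not (name.isascii() and name.isdigit()):
            continue
        try:
            taken = EPOCH + timedelta(milliseconds=int(name))
        except OverflowError:
            continue
        if taken < cutoff:
            print(f"Clearing{suffix}: {name}")
            if not dry_run:
                delete(name)
            cleared.append(name)
    print(f"Cleared{suffix}: {len(cleared)} snapshots")
    return cleared


# Snapshot names from the listing above; `now` is an assumed clock value.
names = ["1529173922063", "1629173909461", "1629173922063",
         "1599173922063", "1629173916816"]
now = datetime(2021, 8, 17, 12, 0, 0, tzinfo=timezone.utc)

remaining = list(names)
dry = expire_snapshots(names, 30, now, remaining.remove, dry_run=True)
after_dry_run = list(remaining)  # the dry run leaves every snapshot in place
real = expire_snapshots(names, 30, now, remaining.remove)
```

Keeping the dry run and the real run on the same code path, differing only in whether `delete` is called, is one way to make sure the preview reports exactly what the real run would remove.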

      Next Steps

      To be completed:

      • Tests
      • Documentation updates

      I am new to this repository and am fuzzy on a few details, even after reading the contribution guide 😅 Any advice on the following would be greatly appreciated!

      • What branch would this type of change be merged into? Currently, I'm targeting apache:trunk by default
      • Is there a test strategy/pattern for this type of change? I was not able to find any existing tests for similar nodetool commands

      Thanks! 😄

              People

              • Assignee:
                jackcasey-visier Jack Casey
                Reporter:
                jackcasey-visier Jack Casey
                Authors:
                Jack Casey
                Reviewers:
                Paulo Motta
                Mentor:
                Paulo Motta