Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Fix Version/s: 0.4
    • Component/s: Core
    • Labels:
      None

      Description

      searching for "snapshot" in *.java shows a bunch of code for supporting snapshots via hard links.

      (this works b/c SSTables are immutable, once created.)

      this used to be more complete, but when we dropped the JDK7 requirement we removed the code that couldn't be done in JDK6, and hard link support was part of that.

      So what you would need to do here is:

      • create a hard link method (using Runtime.exec("ln") on linux / os x I imagine)
      • add a JMX hook to invoke this on the data files (this is where looking at the old codebase might help); ColumnFamilyStoreMBean.forceFlush is an example of an "Action" jmx interface. using jconsole to interact with JMX stuff is explained here: http://wiki.apache.org/cassandra/MemtableThresholds
      • add something to list the snapshots available via JMX
      • optionally make this all per-Table instead of per-database
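      The first step above could be sketched roughly as follows. This is only an illustration, assuming a Unix-like system with `ln` on the PATH; the class and method names (`HardLinkSketch`, `createHardLink`) are hypothetical and are not taken from the attached patches. It uses ProcessBuilder rather than Runtime.exec and waits for the process to exit, so a failed link is reported to the caller.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical helper for illustration; not the code from the attached patches.
final class HardLinkSketch
{
    // Creates a hard link named `link` that points at the existing file `source`.
    // Assumption: a Unix-like system where the "ln" binary is on the PATH.
    static void createHardLink(File source, File link) throws IOException
    {
        ProcessBuilder pb = new ProcessBuilder("ln", source.getAbsolutePath(), link.getAbsolutePath());
        Process p = pb.start();
        try
        {
            // Block until ln has finished so the link exists when we return.
            int exitCode = p.waitFor();
            if (exitCode != 0)
                throw new IOException("ln exited with status " + exitCode);
        }
        catch (InterruptedException e)
        {
            Thread.currentThread().interrupt();
            throw new IOException("interrupted while waiting for ln");
        }
    }
}
```

      Because SSTables are immutable once written, the link and the original can safely share the same data on disk.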
      1. 279-4-2.patch
        6 kB
        Jonathan Ellis
      2. 0004-Patch-for-Cassandra-279-4th.patch
        15 kB
        Sammy Yu
      3. 0003-Patch-for-Cassandra-279-3rd.patch
        15 kB
        Sammy Yu
      4. 279-3.patch
        7 kB
        Jonathan Ellis
      5. 0002-Work-for-CASSANDRA-279.patch
        12 kB
        Sammy Yu
      6. 0001-Work-for-CASSANDRA-279.patch
        12 kB
        Sammy Yu
      7. 0001-Work-for-CASSANDRA-279.patch
        11 kB
        Sammy Yu

        Activity

        Hudson added a comment -

        Integrated in Cassandra #151 (See http://hudson.zones.apache.org/hudson/job/Cassandra/151/)
        finish snapshot support. patch by Sammy Yu; reviewed by jbellis and Michael Greene for

        Jonathan Ellis added a comment -

        committed

        Sammy Yu added a comment -

        Thanks Jonathan, the patch looks good!

        Jonathan Ellis added a comment -

        Patch on top of sammy's latest. Encapsulates snapshot subdir better and adds waitfor to the hardlink process. Look ok?

        Sammy Yu added a comment -

        Incorporated Jonathan's and Michael's comments.

        This is a self-contained patch

        Michael Greene added a comment -

        Agreed with jbellis re: directory hierarchy.
        In the usage, should say "snapshot [name]" so the parameter is discoverable.

        Both commands work correctly over here, on both Ubuntu 9.04 and Win2008.

        Jonathan Ellis added a comment -

        we already have

        <data dir>/<table>

        hierarchy for the normal table data, let's put snapshots under that instead of having a separate tree with its own per-table area

        <data dir>/<table>/<snapshots>/<snapshot tag>/<sstable files>

        Sammy Yu added a comment -

        snapshots directories are now created under the data directory, so the structure looks like this:
        <data directory>/snapshots/<snapshotname>/<table>/<sstable>
        added support for clearsnapshot which will wipe the snapshots directory under each data directory

        Jonathan Ellis added a comment -

        Re Michael's comments,

        > Shouldn't snapshot directories be a list since it mirrors data file directories?

        I didn't understand what Michael meant at first, but yes, we need one snapshot directory per data directory since the reason to have multiple data dirs is to have one per disk, and you can't create hard links across disks.

        So really what we want to do is drop the snapshotdir configuration parameter and just create a snapshots/ subdir for each data dir.

        So to reassemble a snapshot you will have to look in each snapshots/ to see if there are pieces there but that is unavoidable I think. (And not a Big Deal with a little scripting.)

        > Is there no way to expose the tag parameter to nodeprobe?

        +1

        > Also, if the output stream isn't going to be used it should probably not be passed to takeSnapshot.

        +1
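
        The "little scripting" needed to reassemble a snapshot spread over several data directories could look roughly like the sketch below. It assumes each data directory holds its pieces under `snapshots/<tag>`, which is one reading of the layout discussed in this thread; the class and method names are hypothetical.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; the directory layout is an assumption.
final class SnapshotPieces
{
    // Returns every regular file found under <dataDir>/snapshots/<tag>, for each data directory.
    static List<File> collect(List<File> dataDirs, String tag)
    {
        List<File> pieces = new ArrayList<File>();
        for (File dataDir : dataDirs)
        {
            File tagDir = new File(new File(dataDir, "snapshots"), tag);
            addFiles(tagDir, pieces);
        }
        return pieces;
    }

    // Walks dir recursively and appends all non-directory entries to out.
    private static void addFiles(File dir, List<File> out)
    {
        File[] children = dir.listFiles(); // null when dir is missing or not a directory
        if (children == null)
            return;
        for (File child : children)
        {
            if (child.isDirectory())
                addFiles(child, out);
            else
                out.add(child);
        }
    }
}
```

        A data directory that holds no piece of the requested snapshot simply contributes nothing to the result.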

        Jonathan Ellis added a comment -

        Here is a patch on top of sammy's latest that fixes cosmetic issues, removes the unused timestamp variable to snapshot, and adds a waitFor() to the hardlink ProcessBuilder process.

        Sammy Yu added a comment -

        Ignore the other two patches, this one is the cumulative one.

        Michael Greene added a comment - - edited

        This won't apply to trunk. Can you rebase?

        Shouldn't snapshot directories be a list since it mirrors data file directories?
        Is there no way to expose the tag parameter to nodeprobe? Also, if the output stream isn't going to be used it should probably not be passed to takeSnapshot.

        Sammy Yu added a comment -
        • Made Snapshot directory mandatory in the config file
        • Incorporated jbellis' comments
        • Added support for both Windows NT 6.0 and later kernels and older Windows versions.

        Sammy Yu added a comment -

        I will add support for Windows NT 6.0 and later kernels; on older Windows systems it will throw an IOException. Unless there is a big demand, maybe we can bundle something like junction.exe, barring any licensing issues.

        Michael Greene added a comment -

        I'm definitely going to need this to play nicely with Windows (if not support, then at least fail gracefully) and can test (or write before committed) the functionality if Sammy can't.

        What's the intended use of these snapshots? Restoration at a later date? Is that scoped to Cassandra or nodeprobe, or would that be a manual process?

        Jonathan Ellis added a comment -

        thanks for the patch!

        i like the optional user-provided tag; good idea there. I also didn't know about ProcessBuilder – learn something every day.

        some minor changes. sorry for the long-ish list, most are really quick ones:

        • use fooPath for strings that represent fs paths. so you'd have String snapshotPath and File snapshotDir. less confusing than Dir/Directory.
        • you aren't modifying the sstables_ map so you only need to take a read lock, even though you're writing to disk
        • prefer File.separator to System.getProperty("file.separator"). prefer new File(parent, child) to parent + separator + child when a File is needed.
        • throw new IOException("Snapshot directory must be set."); – let's just force this to be set at config time rather than throw an error during the snapshot process that we can't propagate up to the user easily. if we also mkdir it at config time that is one less check we need to do, too.
        • snapshotDirectory, snapshotDir, currentSnapshotDir, snapshotTagDirectory – can we cut down on the local vars here at all? would it help to point out FileUtils.createDirectory which can create multiple subdirs at once, as necessary?

        + if (clientSuppliedName != null && !clientSuppliedName.equals("")) {
        + currentSnapshotDir = snapshotDirectory + System.getProperty("file.separator") + System.currentTimeMillis(
        + }
        + else
        + {
        + currentSnapshotDir = snapshotDirectory + System.getProperty("file.separator") + System.currentTimeMillis(
        + }

        here it's better to build the common part of the var, then use one if to add the tag if necessary. this emphasizes the important part of the if and makes it easier to discern what the conditional is for.
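
        A sketch of that suggestion: build the common part first, then use a single if for the optional tag. The quoted lines are truncated, so the exact name format is not known; the "-" separator and the names `SnapshotNames` and `snapshotDirFor` are assumptions made for illustration only.

```java
import java.io.File;

// Hypothetical helper for illustration; the "<millis>-<tag>" format is an assumption.
final class SnapshotNames
{
    // Returns <snapshotDir>/<millis>, or <snapshotDir>/<millis>-<tag> when a tag is supplied.
    static File snapshotDirFor(File snapshotDir, long timestampMillis, String clientSuppliedName)
    {
        // Common part, shared by both cases.
        String name = Long.toString(timestampMillis);
        // The only conditional: append the tag when the client supplied one.
        if (clientSuppliedName != null && !clientSuppliedName.equals(""))
            name = name + "-" + clientSuppliedName;
        // new File(parent, child) avoids concatenating separators by hand.
        return new File(snapshotDir, name);
    }
}
```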

        • normally you can use braces or not in one-liners as you please but for isDebugEnabled, always leave them off; it's too much noise for a debug statement
        • remember to set your ide to indent w/ 4 spaces, not tabs
        • windows support? "On Microsoft Windows, hard links can be created using the mklink /H command on Windows NT 6.0 and later systems (such as Windows Vista), and in earlier systems (Windows 2000, XP) using fsutil hardlink create." if it's a big deal we can leave this to someone who actually needs the feature but it seems that if you have Windows available to test on it's not hard.
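
        The platform split described in the last item might be sketched like this. It only builds the command line and does not run it. The class and method names are hypothetical, and checking the major part of the os.version system property (6 or higher for NT 6.0 and later) is an assumption about how the kernel version would be detected. Since mklink is a cmd.exe built-in, it is invoked through `cmd /c`.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not the code from the attached patches.
final class HardLinkCommand
{
    // Picks the hard link command for the given os.name / os.version values.
    static List<String> forPlatform(String osName, String osVersion, String source, String link)
    {
        if (osName.startsWith("Windows"))
        {
            // Assumption: os.version looks like "6.0" (Vista) or "5.1" (XP).
            if (majorVersion(osVersion) >= 6)
                return Arrays.asList("cmd", "/c", "mklink", "/H", link, source);
            return Arrays.asList("fsutil", "hardlink", "create", link, source);
        }
        // Linux, OS X and other Unix-like systems.
        return Arrays.asList("ln", source, link);
    }

    // Parses the part of the version string before the first dot; returns 0 if it is not a number.
    private static int majorVersion(String osVersion)
    {
        int dot = osVersion.indexOf('.');
        String major = dot < 0 ? osVersion : osVersion.substring(0, dot);
        try
        {
            return Integer.parseInt(major);
        }
        catch (NumberFormatException e)
        {
            return 0;
        }
    }
}
```

        The returned list can be passed straight to `new ProcessBuilder(...)`, with `System.getProperty("os.name")` and `System.getProperty("os.version")` as the first two arguments.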
        Sammy Yu added a comment -

        Based on avinash's original Java 7 version (commit 21511d028741de80d02ea648cb76be7aa67bffc2)
        Added FileUtils.createHardLink
        Modified ColumnFamilyStore to support taking snapshot at column family level
        Modified Table to support taking snapshot.
        Modified StorageServiceMBean to support taking snapshots of all tables or individual table with optional tag.
        Updated nodeprobe to include snapshot command which takes a snapshot for all the tables.

        Jonathan Ellis added a comment -

        r756666 - r756758 is where avinash took out the jdk7 dependencies


          People

          • Assignee:
            Sammy Yu
            Reporter:
            Jonathan Ellis