Details

    • Type: Sub-task Sub-task
    • Status: Resolved
    • Priority: Minor Minor
    • Resolution: Fixed
    • Fix Version/s: 0.7 beta 1
    • Component/s: Core
    • Labels:
      None

      Description

      Create an additional system keyspace to store KS definitions and migration data.
      Tweak the internals so cassandra can optional load all definitions from there, ignoring CF definitions in the xml.

        Activity

        Hide
        Hudson added a comment -

        Integrated in Cassandra #373 (See http://hudson.zones.apache.org/hudson/job/Cassandra/373/)
        use jug for TimeUUID generation

        Show
        Hudson added a comment - Integrated in Cassandra #373 (See http://hudson.zones.apache.org/hudson/job/Cassandra/373/ ) use jug for TimeUUID generation
        Hide
        Jonathan Ellis added a comment -

        +1

        Show
        Jonathan Ellis added a comment - +1
        Hide
        Gary Dusbabek added a comment -

        Removed unnecessary locking. fixMaxId() doesn't need to be called until we are loading from storage. When loading from XML, the counter starts at 0 and works its way up (it's reset every restart).

        Show
        Gary Dusbabek added a comment - Removed unnecessary locking. fixMaxId() doesn't need to be called until we are loading from storage. When loading from XML, the counter starts at 0 and works its way up (it's reset every restart).
        Hide
        Jonathan Ellis added a comment -

        fixMaxId is never called?

        wrapping an AtomicInteger w/ synchronized isn't the end of the world but it's kind of ugly.

        Maybe using a wrapper class like in CommitLog to avoid explicit synchronization but still getting laziness would work here instead?

        Show
        Jonathan Ellis added a comment - fixMaxId is never called? wrapping an AtomicInteger w/ synchronized isn't the end of the world but it's kind of ugly. Maybe using a wrapper class like in CommitLog to avoid explicit synchronization but still getting laziness would work here instead?
        Hide
        Gary Dusbabek added a comment -

        Correct.

        Show
        Gary Dusbabek added a comment - Correct.
        Hide
        Jonathan Ellis added a comment -

        So for 825, the plan is drop patch 5, make the other tweaks I mentioned, and then make the CLH changes as part of 840?

        Show
        Jonathan Ellis added a comment - So for 825, the plan is drop patch 5, make the other tweaks I mentioned, and then make the CLH changes as part of 840?
        Hide
        Gary Dusbabek added a comment -

        > no longer function of # of cfs – ?
        I guess that no longer matters. At some point I was relying on the fact that I could know the size of the CLH given the number of CFs. But I got rid of that at some point.

        Show
        Gary Dusbabek added a comment - > no longer function of # of cfs – ? I guess that no longer matters. At some point I was relying on the fact that I could know the size of the CLH given the number of CFs. But I got rid of that at some point.
        Hide
        Jonathan Ellis added a comment - - edited

        size needs to be fixed [per CLH] – right

        no longer function of # of cfs – ?

        Show
        Jonathan Ellis added a comment - - edited size needs to be fixed [per CLH] – right no longer function of # of cfs – ?
        Hide
        Gary Dusbabek added a comment -

        No, I meant RM, but now that I think of it, CLH is a better place. The only down-side that it needs the entire mapping because its size needs to be fixed and its size no longer becomes a function of the number of cfs. But I think that's manageable.

        Show
        Gary Dusbabek added a comment - No, I meant RM, but now that I think of it, CLH is a better place. The only down-side that it needs the entire mapping because its size needs to be fixed and its size no longer becomes a function of the number of cfs. But I think that's manageable.
        Hide
        Jonathan Ellis added a comment -

        You're right, I was thinking CLH had an id -> name map already.

        We definitely don't want to put an entire name/id map in RM though, I assume you meant CLH?

        Show
        Jonathan Ellis added a comment - You're right, I was thinking CLH had an id -> name map already. We definitely don't want to put an entire name/id map in RM though, I assume you meant CLH?
        Hide
        Gary Dusbabek added a comment -

        CLH maps ids to dirty bits and flush points-no names. If we modified RM to contain a name>id map, we'd be in good shape and could get rid of the historical names.

        What I've done (and don't care for) in 840 is modify the CF during recovery if it is detected that the name has changed (can't find an id). This is error-prone because if a user has a habit of recycling cfnames over time, the reverse mapping can become ambiguous.

        I'll make a change so that the cfid can be had from the RM as opposed to table.

        Show
        Gary Dusbabek added a comment - CLH maps ids to dirty bits and flush points- no names. If we modified RM to contain a name >id map, we'd be in good shape and could get rid of the historical names. What I've done (and don't care for) in 840 is modify the CF during recovery if it is detected that the name has changed (can't find an id). This is error-prone because if a user has a habit of recycling cfnames over time, the reverse mapping can become ambiguous. I'll make a change so that the cfid can be had from the RM as opposed to table.
        Hide
        Jonathan Ellis added a comment -

        each segment has a header w/ a name -> id map [currently you have id -> name but just make it a BiMap] so you use that to get the id for the RM cfname.

        either way we need to change Table.apply to take a Map<name, id> representing what the mapping was at the time of the RM so it can put it in the right CFS [not necessarily in this patchset]. I'm saying it's simpler to just use the one you already have in the CLH than build one up laboriously from binary-searched timestams.

        Show
        Jonathan Ellis added a comment - each segment has a header w/ a name -> id map [currently you have id -> name but just make it a BiMap] so you use that to get the id for the RM cfname. either way we need to change Table.apply to take a Map<name, id> representing what the mapping was at the time of the RM so it can put it in the right CFS [not necessarily in this patchset] . I'm saying it's simpler to just use the one you already have in the CLH than build one up laboriously from binary-searched timestams.
        Hide
        Gary Dusbabek added a comment -

        Right, and that's the approach I've taken in CASSANDRA-840. But what of old segments still laying around that may need to be used to recover after a restart? They still refer to CF names that no longer a rename.

        Show
        Gary Dusbabek added a comment - Right, and that's the approach I've taken in CASSANDRA-840 . But what of old segments still laying around that may need to be used to recover after a restart? They still refer to CF names that no longer a rename.
        Hide
        Jonathan Ellis added a comment -

        flushing is a lot of work, but creating a new segment (w/ new header) is not.

        Show
        Jonathan Ellis added a comment - flushing is a lot of work, but creating a new segment (w/ new header) is not.
        Hide
        Gary Dusbabek added a comment -

        > - for patch 5, why do we need historical names?
        I might be missing something, but the cfname is one of the serialized properties of RowMutation that ends up in the commit log. Since it would be a lot of work to flush the commit log prior to a rename, we need away to get from the old cfnames to the current id. It doesn't have anything to do with old sstables.

        Show
        Gary Dusbabek added a comment - > - for patch 5, why do we need historical names? I might be missing something, but the cfname is one of the serialized properties of RowMutation that ends up in the commit log. Since it would be a lot of work to flush the commit log prior to a rename, we need away to get from the old cfnames to the current id. It doesn't have anything to do with old sstables.
        Hide
        Jonathan Ellis added a comment -
        • Can you use less words in parameter / field names, but spell them out?

        repStratClass -> strategyClass
        epSnitch -> snitch
        etc.

        (for local variable names KSMetaData ksm etc. is fine)

        • Can you add a FBUtilities convenience for this?
          UUIDGen.makeType1UUID(UUIDGen.fromHost(FBUtilities.getLocalAddress()))
        • This probably isn't a good mac address replacment – since it's iterating most-significant-first, the chance of collision w/in a cluster is basically guaranteed. Which probably isn't the end of the world here but it's not too much more work to run getAddress through md5 first.
          + for (int i = 0; i < enet.length && i < addr.length; i++)
          + enet[i] = addr[i];
        • this is clunky and possibly error-prone (since if we are "jumping" the counter higher when we load a new id? then we might have been generating "new" ids that collided with yet-to-be-read ones)
          + synchronized (idGen)
          + { + if (idGen.get() <= cfId) + idGen.set(cfId + 1); + }

        Can we just initialize the counter to the max value at initialization time?

        • for patch 5, why do we need historical names? we don't need it for log replay (only the current name gets turned into an sstable) and if we're going to use it to read old sstables based on timestamp, that's a mess, we'd need to record not only the name sequence but how long each name was mapped to the id. Wouldn't it be simpler to name sstable files by cf ID instead of name? [for old installs a one-time conversion on startup can be done, rename is fast so it's ok to do that rather than deal w/ complexity of mixed ids and names, imo]. Or is this for something else entirely?
        Show
        Jonathan Ellis added a comment - Can you use less words in parameter / field names, but spell them out? repStratClass -> strategyClass epSnitch -> snitch etc. (for local variable names KSMetaData ksm etc. is fine) Can you add a FBUtilities convenience for this? UUIDGen.makeType1UUID(UUIDGen.fromHost(FBUtilities.getLocalAddress())) This probably isn't a good mac address replacment – since it's iterating most-significant-first, the chance of collision w/in a cluster is basically guaranteed. Which probably isn't the end of the world here but it's not too much more work to run getAddress through md5 first. + for (int i = 0; i < enet.length && i < addr.length; i++) + enet [i] = addr [i] ; this is clunky and possibly error-prone (since if we are "jumping" the counter higher when we load a new id? then we might have been generating "new" ids that collided with yet-to-be-read ones) + synchronized (idGen) + { + if (idGen.get() <= cfId) + idGen.set(cfId + 1); + } Can we just initialize the counter to the max value at initialization time? for patch 5, why do we need historical names? we don't need it for log replay (only the current name gets turned into an sstable) and if we're going to use it to read old sstables based on timestamp, that's a mess, we'd need to record not only the name sequence but how long each name was mapped to the id. Wouldn't it be simpler to name sstable files by cf ID instead of name? [for old installs a one-time conversion on startup can be done, rename is fast so it's ok to do that rather than deal w/ complexity of mixed ids and names, imo] . Or is this for something else entirely?
        Hide
        Gary Dusbabek added a comment -

        yeah. I'll clean that up.

        Show
        Gary Dusbabek added a comment - yeah. I'll clean that up.
        Hide
        Jonathan Ellis added a comment -
        • public final double keyCacheSize; // default 0.01
          + public final double keyCacheFraction; // default 0.01

        bad rebase?

        Show
        Jonathan Ellis added a comment - public final double keyCacheSize; // default 0.01 + public final double keyCacheFraction; // default 0.01 bad rebase?
        Hide
        Jonathan Ellis added a comment -

        You can pass the generator an "EthernetAddress" containing whatever bytes you want: http://jug.safehaus.org/curr/javadoc/org/safehaus/uuid/UUIDGenerator.html

        Show
        Jonathan Ellis added a comment - You can pass the generator an "EthernetAddress" containing whatever bytes you want: http://jug.safehaus.org/curr/javadoc/org/safehaus/uuid/UUIDGenerator.html
        Hide
        Gary Dusbabek added a comment -

        I looked at jug briefly but stopped when I saw native libraries. If there were a way to use it and extend it so that native bits aren't needed, then I'm +1. I prefer to avoid introducing native libs if we can avoid it, especially for something this trivial. I like the 'put something else unique in those bytes' approach.

        2-4 modify introduces DefsTable (similar to SystemTable) as a way to modify storing and loading the data definitions to the new definitions keyspace.
        5 makes it so that cfid generation is safe across restarts and can happen at times other than startup.
        6 gives support for tracking historical CF names. In the case of cf renames, the id doesn't change, but the name does and we need a way to get back to the id when recovering.

        The basic idea of the set is that we need to add support for reading data definitions from storage and not xml.

        Show
        Gary Dusbabek added a comment - I looked at jug briefly but stopped when I saw native libraries. If there were a way to use it and extend it so that native bits aren't needed, then I'm +1. I prefer to avoid introducing native libs if we can avoid it, especially for something this trivial. I like the 'put something else unique in those bytes' approach. 2-4 modify introduces DefsTable (similar to SystemTable) as a way to modify storing and loading the data definitions to the new definitions keyspace. 5 makes it so that cfid generation is safe across restarts and can happen at times other than startup. 6 gives support for tracking historical CF names. In the case of cf renames, the id doesn't change, but the name does and we need a way to get back to the id when recovering. The basic idea of the set is that we need to add support for reading data definitions from storage and not xml.
        Hide
        Jonathan Ellis added a comment -

        Can you give a high-level overview of the the remaining patches are doing?

        Show
        Jonathan Ellis added a comment - Can you give a high-level overview of the the remaining patches are doing?
        Hide
        Jonathan Ellis added a comment -

        JUG also provides a native library for getting the mac address. I'm -1 on putting mac in the config since it completely screws the ability to run a cluster from a common file. If we can't get the mac from native code on platform X, or if it's painful, then let's make something up and use it as the mac bytes. For our purposes one pseudo-unique set of bytes in the MAC positions is as good as another.

        Show
        Jonathan Ellis added a comment - JUG also provides a native library for getting the mac address. I'm -1 on putting mac in the config since it completely screws the ability to run a cluster from a common file. If we can't get the mac from native code on platform X, or if it's painful, then let's make something up and use it as the mac bytes. For our purposes one pseudo-unique set of bytes in the MAC positions is as good as another.
        Hide
        Jonathan Ellis added a comment -

        Instead of rolling our own uuid generator can we just use http://jug.safehaus.org/ ? It appears to be available in maven repos so Ivy should be cool.

        Show
        Jonathan Ellis added a comment - Instead of rolling our own uuid generator can we just use http://jug.safehaus.org/ ? It appears to be available in maven repos so Ivy should be cool.

          People

          • Assignee:
            Gary Dusbabek
            Reporter:
            Gary Dusbabek
          • Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development