Details

    • Type: Sub-task Sub-task
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Fix Version/s: 0.7 beta 1
    • Component/s: None
    • Labels:
      None

      Description

      Add hooks that recognize when system KS is updated.
      Figure out the best way to refresh the collections in DD, perform the adds, deletes, etc.

        Activity

        Gary Dusbabek created issue -
        Hide
        Gary Dusbabek added a comment -

        Before I go down this path, I want to make sure I have the thrift changes basically right.

        Show
        Gary Dusbabek added a comment - Before I go down this path, I want to make sure I have the thrift changes basically right.
        Gary Dusbabek made changes -
        Field Original Value New Value
        Attachment thrift-changes.patch [ 12439457 ]
        Hide
        Jonathan Ellis added a comment -

        void add_column_family_definition(1:required CfDef cf_def)

        shouldn't this take a KS arg too?

        Show
        Jonathan Ellis added a comment - void add_column_family_definition(1:required CfDef cf_def) shouldn't this take a KS arg too?
        Hide
        Jonathan Ellis added a comment -

        other things to think about w/ this issue:

        how do we reconcile this with the "login to keyspace before doing stuff" authentication change? should we require logging in to the system keyspace, before doing allowing schema modifications?

        should we prefix these methods with system_ or something to denote "not normal data modification stuff?"

        Show
        Jonathan Ellis added a comment - other things to think about w/ this issue: how do we reconcile this with the "login to keyspace before doing stuff" authentication change? should we require logging in to the system keyspace, before doing allowing schema modifications? should we prefix these methods with system_ or something to denote "not normal data modification stuff?"
        Hide
        Gary Dusbabek added a comment -

        >shouldn't this take a KS arg too?
        covered in cf_def.table.

        >should we require logging in to the system keyspace
        Yes, I think this is a good idea.

        >should we prefix these methods with system_ or something
        sure. I'll all 'system_' and drop '_definition' from the method names.

        Show
        Gary Dusbabek added a comment - >shouldn't this take a KS arg too? covered in cf_def.table. >should we require logging in to the system keyspace Yes, I think this is a good idea. >should we prefix these methods with system_ or something sure. I'll all 'system_' and drop '_definition' from the method names.
        Gary Dusbabek made changes -
        Attachment v1-0001-remove-unused-JoinMessage.txt [ 12439905 ]
        Attachment v1-0002-Refactor-so-meta-mutations-can-be-serialized-and-moved.txt [ 12439906 ]
        Attachment v1-0003-move-ConfigurationException-to-its-own-class-file.txt [ 12439907 ]
        Attachment v1-0004-Gossip-metadata-version-and-request-updates.txt [ 12439908 ]
        Attachment v1-0005-thrift-changes.txt [ 12439909 ]
        Attachment v1-0006-switch-to-reading-schema-configuration-from-storage.txt [ 12439910 ]
        Attachment v1-0007-thrift-impl.txt [ 12439911 ]
        Attachment v1-0008-use-cfid-for-all-mutations.txt [ 12439912 ]
        Gary Dusbabek made changes -
        Attachment thrift-changes.patch [ 12439457 ]
        Gary Dusbabek made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Fix Version/s 0.7 [ 12314533 ]
        Hide
        Gary Dusbabek added a comment - - edited

        0002 refactors DefsTable heavily. Each operation is broken out to its own class that can be serialized and moved to other nodes. It also introduces the concept of a 'migration,' which is a schema operation (add|drop|rename cf|ks).

        0004 changes gossip to that it includes the schema version. It brings back the DefsTable executor as a stage on which migrations are executed. A seed node is instructed to wait (same as load balance waiting period) so that it can contact other seed nodes for updated schemas before it begins serving requests.

        0006 changes startup to always load schema from the system tables. The load-from-xml code was kept around to help with transition and exposed in JMX. We can remove this in a release or two. The java unit tests still load schema from xml.

        0007 implements the stubbed thrift methods. The nosetests use them to establish a testing schema.

        0008 alters the write path so that the cfid is used to determine which CFS is written to during normal writes and CL recovery. CF names are still used (extensively--I didn't have the guts to push this through all the way) for other operations. I'm not happy with Table.cfNameMap but don't see an alternative that doesn't require touching a lot of code.

        Show
        Gary Dusbabek added a comment - - edited 0002 refactors DefsTable heavily. Each operation is broken out to its own class that can be serialized and moved to other nodes. It also introduces the concept of a 'migration,' which is a schema operation (add|drop|rename cf|ks). 0004 changes gossip to that it includes the schema version. It brings back the DefsTable executor as a stage on which migrations are executed. A seed node is instructed to wait (same as load balance waiting period) so that it can contact other seed nodes for updated schemas before it begins serving requests. 0006 changes startup to always load schema from the system tables. The load-from-xml code was kept around to help with transition and exposed in JMX. We can remove this in a release or two. The java unit tests still load schema from xml. 0007 implements the stubbed thrift methods. The nosetests use them to establish a testing schema. 0008 alters the write path so that the cfid is used to determine which CFS is written to during normal writes and CL recovery. CF names are still used (extensively--I didn't have the guts to push this through all the way) for other operations. I'm not happy with Table.cfNameMap but don't see an alternative that doesn't require touching a lot of code.
        Hide
        Jonathan Ellis added a comment -

        Looks pretty good to me!

        • 02: Can we provide a .serializer().de/serialize interface to the Migrations, instead of getBytes, for consistency w/ the other code? (Alternatively, we could move that stuff into a Thrift class; rolling that stuff by hand is something I'd like to move away from.)
        • 04: the requesting of CF definitions looks like it should be done in a gossip listener, not directly in Gossiper. we shouldn't have to modify Gossiper internals at all; the CF defs should go in a new applicationstate, and someone (KSMetaData?) should register an onChange listener that does the requesting of updates.

        "seed nodes should try to get schema updates from other seed nodes before going into normal mode" seems like complexity that doesn't really buy us much.

        I'm not sure what "don't bootstrap if there are no tables defined" gets us either.

        we should make a best effort to push the migration out to all live nodes, and let dead or partitioned ones catch up via the gossip method. relying entirely on gossip makes propagation pretty slow even when cluster is entirely healthy. (the "push" could just be a hint for "everyone request defns from me.") doing this in another ticket is fine imo.

        • 08: if RowMutation is still using strings instead of cfids, doesn't that leave us vulnerable to the bug discussed earlier where it gets applied to the wrong CF after a rename?
        Show
        Jonathan Ellis added a comment - Looks pretty good to me! 02: Can we provide a .serializer().de/serialize interface to the Migrations, instead of getBytes, for consistency w/ the other code? (Alternatively, we could move that stuff into a Thrift class; rolling that stuff by hand is something I'd like to move away from.) 04: the requesting of CF definitions looks like it should be done in a gossip listener, not directly in Gossiper. we shouldn't have to modify Gossiper internals at all; the CF defs should go in a new applicationstate, and someone (KSMetaData?) should register an onChange listener that does the requesting of updates. "seed nodes should try to get schema updates from other seed nodes before going into normal mode" seems like complexity that doesn't really buy us much. I'm not sure what "don't bootstrap if there are no tables defined" gets us either. we should make a best effort to push the migration out to all live nodes, and let dead or partitioned ones catch up via the gossip method. relying entirely on gossip makes propagation pretty slow even when cluster is entirely healthy. (the "push" could just be a hint for "everyone request defns from me.") doing this in another ticket is fine imo. 08: if RowMutation is still using strings instead of cfids, doesn't that leave us vulnerable to the bug discussed earlier where it gets applied to the wrong CF after a rename?
        Hide
        Gary Dusbabek added a comment -

        02: I'll pass on the thrift class stuff. I must be the only person who finds ICompactSerializer<T> overwrought.
        04: I'll move that part out of the gossiper.

        >"seed nodes should try to get schema updates from other seed nodes before going into normal mode"
        My intent was to ensure that seeds have updated schemas before they start doing their jobs in order for them to transmit that data to new nodes since the seeds are contacted first. If that comes out, the worst case is that it takes another gossip round or two before the new nodes fine someone with the most current schema. In retrospect, I think you're right--probably over complex.

        >"don't bootstrap if there are no tables defined"
        SS.finishBootstrapping() was only ever called via removeBootstrapSource(). And when there are no tables, there are no bootstrap sources to remove (streaming never finishes). Detecting this condition and avoiding the wait seemed like the best course to take.

        08: The critical parts were Table.apply and CL.recover. Both of those are handled. The use of QueryPath makes RowMutation somewhat vulnerable no matter what. There isn't a lot we can do there besides using cfids throughout.

        Show
        Gary Dusbabek added a comment - 02: I'll pass on the thrift class stuff. I must be the only person who finds ICompactSerializer<T> overwrought. 04: I'll move that part out of the gossiper. >"seed nodes should try to get schema updates from other seed nodes before going into normal mode" My intent was to ensure that seeds have updated schemas before they start doing their jobs in order for them to transmit that data to new nodes since the seeds are contacted first. If that comes out, the worst case is that it takes another gossip round or two before the new nodes fine someone with the most current schema. In retrospect, I think you're right--probably over complex. >"don't bootstrap if there are no tables defined" SS.finishBootstrapping() was only ever called via removeBootstrapSource(). And when there are no tables, there are no bootstrap sources to remove (streaming never finishes). Detecting this condition and avoiding the wait seemed like the best course to take. 08: The critical parts were Table.apply and CL.recover. Both of those are handled. The use of QueryPath makes RowMutation somewhat vulnerable no matter what. There isn't a lot we can do there besides using cfids throughout.
        Hide
        Jonathan Ellis added a comment -

        "Detecting this condition and avoiding the wait seemed like the best course to take. "

        makes sense, +1

        "There isn't a lot we can do there besides using cfids throughout. "

        Yeah, I think that's ultimate what we want to do. No? Separate ticket is probably best.

        Show
        Jonathan Ellis added a comment - "Detecting this condition and avoiding the wait seemed like the best course to take. " makes sense, +1 "There isn't a lot we can do there besides using cfids throughout. " Yeah, I think that's ultimate what we want to do. No? Separate ticket is probably best.
        Gary Dusbabek made changes -
        Attachment v1-0001-remove-unused-JoinMessage.txt [ 12440381 ]
        Attachment v1-0002-Refactor-so-meta-mutations-can-be-serialized-and-moved.txt [ 12440382 ]
        Attachment v1-0003-move-ConfigurationException-to-its-own-class-file.txt [ 12440383 ]
        Attachment v1-0004-Gossip-metadata-version-and-request-updates.txt [ 12440384 ]
        Attachment v1-0005-thrift-changes.txt [ 12440385 ]
        Attachment v1-0006-switch-to-reading-schema-configuration-from-storage.txt [ 12440386 ]
        Attachment v1-0007-thrift-impl.txt [ 12440387 ]
        Attachment v1-0008-use-cfid-for-all-mutations.txt [ 12440388 ]
        Attachment v1-0009-merge-w-last.txt [ 12440389 ]
        Gary Dusbabek made changes -
        Attachment v1-0001-remove-unused-JoinMessage.txt [ 12440381 ]
        Gary Dusbabek made changes -
        Attachment v1-0001-remove-unused-JoinMessage.txt [ 12439905 ]
        Gary Dusbabek made changes -
        Attachment v1-0002-Refactor-so-meta-mutations-can-be-serialized-and-moved.txt [ 12440382 ]
        Gary Dusbabek made changes -
        Attachment v1-0002-Refactor-so-meta-mutations-can-be-serialized-and-moved.txt [ 12439906 ]
        Gary Dusbabek made changes -
        Attachment v1-0003-move-ConfigurationException-to-its-own-class-file.txt [ 12440383 ]
        Gary Dusbabek made changes -
        Attachment v1-0003-move-ConfigurationException-to-its-own-class-file.txt [ 12439907 ]
        Gary Dusbabek made changes -
        Attachment v1-0004-Gossip-metadata-version-and-request-updates.txt [ 12440384 ]
        Gary Dusbabek made changes -
        Attachment v1-0004-Gossip-metadata-version-and-request-updates.txt [ 12439908 ]
        Gary Dusbabek made changes -
        Attachment v1-0005-thrift-changes.txt [ 12440385 ]
        Gary Dusbabek made changes -
        Attachment v1-0005-thrift-changes.txt [ 12439909 ]
        Gary Dusbabek made changes -
        Attachment v1-0006-switch-to-reading-schema-configuration-from-storage.txt [ 12440386 ]
        Gary Dusbabek made changes -
        Attachment v1-0006-switch-to-reading-schema-configuration-from-storage.txt [ 12439910 ]
        Gary Dusbabek made changes -
        Attachment v1-0007-thrift-impl.txt [ 12440387 ]
        Gary Dusbabek made changes -
        Attachment v1-0007-thrift-impl.txt [ 12439911 ]
        Gary Dusbabek made changes -
        Attachment v1-0008-use-cfid-for-all-mutations.txt [ 12440388 ]
        Gary Dusbabek made changes -
        Attachment v1-0008-use-cfid-for-all-mutations.txt [ 12439912 ]
        Gary Dusbabek made changes -
        Attachment v1-0009-merge-w-last.txt [ 12440389 ]
        Gary Dusbabek made changes -
        Attachment v2-0001-remove-unused-JoinMessage.txt [ 12440390 ]
        Attachment v2-0002-Refactor-so-meta-mutations-can-be-serialized-and-moved.txt [ 12440391 ]
        Attachment v2-0003-move-ConfigurationException-to-its-own-class-file.txt [ 12440392 ]
        Attachment v2-0004-Gossip-metadata-version-and-request-updates.txt [ 12440393 ]
        Attachment v2-0005-thrift-changes.txt [ 12440394 ]
        Attachment v2-0006-switch-to-reading-schema-configuration-from-storage.txt [ 12440395 ]
        Attachment v2-0007-thrift-impl.txt [ 12440396 ]
        Attachment v2-0008-use-cfid-for-all-mutations.txt [ 12440397 ]
        Hide
        Jonathan Ellis added a comment -

        I don't see either the new ANNOUNCE verb or a gossip onchange in patch 04, I guess those are going to be in other tickets?

        Show
        Jonathan Ellis added a comment - I don't see either the new ANNOUNCE verb or a gossip onchange in patch 04, I guess those are going to be in other tickets?
        Hide
        Gary Dusbabek added a comment -

        Looks like I merged it into 02. See Migration.announce().

        Show
        Gary Dusbabek added a comment - Looks like I merged it into 02. See Migration.announce().
        Hide
        Jonathan Ellis added a comment -

        Is there a way we could get the defsVersion stuff to go in SystemTable.StorageMetadata (a singleton that StorageService keeps a reference to) instead of DatabaseDescriptor, which is only for user-configurable stuff?

        Show
        Jonathan Ellis added a comment - Is there a way we could get the defsVersion stuff to go in SystemTable.StorageMetadata (a singleton that StorageService keeps a reference to) instead of DatabaseDescriptor, which is only for user-configurable stuff?
        Gary Dusbabek made changes -
        Attachment v3-0001-remove-unused-JoinMessage.txt [ 12440896 ]
        Attachment v3-0002-Refactor-so-meta-mutations-can-be-serialized-and-moved.txt [ 12440897 ]
        Attachment v3-0003-move-ConfigurationException-to-its-own-class-file.txt [ 12440898 ]
        Attachment v3-0004-Gossip-metadata-version-and-request-updates.txt [ 12440899 ]
        Attachment v3-0005-thrift-changes.txt [ 12440900 ]
        Attachment v3-0006-switch-to-reading-schema-configuration-from-storage.txt [ 12440901 ]
        Attachment v3-0007-thrift-impl.txt [ 12440902 ]
        Attachment v3-0008-use-cfid-for-all-mutations.txt [ 12440903 ]
        Attachment v3-0009-move-defsVersion-management-into-SystemTable.txt [ 12440904 ]
        Gary Dusbabek made changes -
        Attachment v2-0001-remove-unused-JoinMessage.txt [ 12440390 ]
        Gary Dusbabek made changes -
        Attachment v2-0002-Refactor-so-meta-mutations-can-be-serialized-and-moved.txt [ 12440391 ]
        Gary Dusbabek made changes -
        Attachment v2-0003-move-ConfigurationException-to-its-own-class-file.txt [ 12440392 ]
        Gary Dusbabek made changes -
        Attachment v2-0004-Gossip-metadata-version-and-request-updates.txt [ 12440393 ]
        Gary Dusbabek made changes -
        Attachment v2-0005-thrift-changes.txt [ 12440394 ]
        Gary Dusbabek made changes -
        Attachment v2-0006-switch-to-reading-schema-configuration-from-storage.txt [ 12440395 ]
        Gary Dusbabek made changes -
        Attachment v2-0007-thrift-impl.txt [ 12440396 ]
        Gary Dusbabek made changes -
        Attachment v2-0008-use-cfid-for-all-mutations.txt [ 12440397 ]
        Hide
        Gary Dusbabek added a comment - - edited

        Not easily. The current defsVersion needs to be known long before StorageService (and SystemMetadata) is initialized. I rearranged things so that DD keeps track of the initial version (from a schema load or whatever) and passes it to SS. Its all in v3-0009. From then on, SS is queried/updated when the uuid version is needed.

        I also fixed a rename bug where the replication strategy wasn't getting reloaded (in case it changed as a result of a drop+add).

        Show
        Gary Dusbabek added a comment - - edited Not easily. The current defsVersion needs to be known long before StorageService (and SystemMetadata) is initialized. I rearranged things so that DD keeps track of the initial version (from a schema load or whatever) and passes it to SS. Its all in v3-0009. From then on, SS is queried/updated when the uuid version is needed. I also fixed a rename bug where the replication strategy wasn't getting reloaded (in case it changed as a result of a drop+add).
        Hide
        Jonathan Ellis added a comment -

        Hmm. Having it in two places seems worse than having it in DD.

        + DatabaseDescriptor.setTableDefinition(ksm);
        + SystemTable.initMetadata().setDefsVersion(newVersion);

        The problem is it's not obvious to someone browsing that calling DD.sTD alone will leave the system in an inconsistent state. Let's go with the DD-only code.

        Show
        Jonathan Ellis added a comment - Hmm. Having it in two places seems worse than having it in DD. + DatabaseDescriptor.setTableDefinition(ksm); + SystemTable.initMetadata().setDefsVersion(newVersion); The problem is it's not obvious to someone browsing that calling DD.sTD alone will leave the system in an inconsistent state. Let's go with the DD-only code.
        Gary Dusbabek made changes -
        Attachment v3-0001-remove-unused-JoinMessage.txt [ 12440896 ]
        Gary Dusbabek made changes -
        Attachment v3-0002-Refactor-so-meta-mutations-can-be-serialized-and-moved.txt [ 12440897 ]
        Gary Dusbabek made changes -
        Attachment v3-0003-move-ConfigurationException-to-its-own-class-file.txt [ 12440898 ]
        Gary Dusbabek made changes -
        Attachment v3-0004-Gossip-metadata-version-and-request-updates.txt [ 12440899 ]
        Gary Dusbabek made changes -
        Attachment v3-0005-thrift-changes.txt [ 12440900 ]
        Gary Dusbabek made changes -
        Attachment v3-0006-switch-to-reading-schema-configuration-from-storage.txt [ 12440901 ]
        Gary Dusbabek made changes -
        Attachment v3-0007-thrift-impl.txt [ 12440902 ]
        Gary Dusbabek made changes -
        Attachment v3-0008-use-cfid-for-all-mutations.txt [ 12440903 ]
        Gary Dusbabek made changes -
        Attachment v3-0009-move-defsVersion-management-into-SystemTable.txt [ 12440904 ]
        Hide
        Gary Dusbabek added a comment -

        Removed 0009. replication strategy reload change is rebased into 0002. Everything is is the same.

        Show
        Gary Dusbabek added a comment - Removed 0009. replication strategy reload change is rebased into 0002. Everything is is the same.
        Hide
        Jonathan Ellis added a comment -

        +1

        Show
        Jonathan Ellis added a comment - +1
        Gary Dusbabek made changes -
        Status Patch Available [ 10002 ] Resolved [ 5 ]
        Resolution Fixed [ 1 ]
        Hide
        Hudson added a comment -

        Integrated in Cassandra #400 (See http://hudson.zones.apache.org/hudson/job/Cassandra/400/)
        use cfid for all mutations. Patch by Gary Dusbabek, reviewed by Jonthan Ellis.
        thrift impl. Patch by Gary Dusbabek, reviewed by Jonthan Ellis.
        switch to reading schema configuration from storage. Patch by Gary Dusbabek, reviewed by Jonthan Ellis.
        thrift changes. Patch by Gary Dusbabek, reviewed by Jonthan Ellis.
        Gossip metadata version and request updates. Patch by Gary Dusbabek, reviewed by Jonthan Ellis.
        move ConfigurationException to its own class file. Patch by Gary Dusbabek, reviewed by Jonthan Ellis.
        Refactor so meta mutations can be serialized and moved around. Patch by Gary Dusbabek, reviewed by Jonthan Ellis.
        remove unused JoinMessage. Patch by Gary Dusbabek, reviewed by Jonthan Ellis.

        Show
        Hudson added a comment - Integrated in Cassandra #400 (See http://hudson.zones.apache.org/hudson/job/Cassandra/400/ ) use cfid for all mutations. Patch by Gary Dusbabek, reviewed by Jonthan Ellis. thrift impl. Patch by Gary Dusbabek, reviewed by Jonthan Ellis. switch to reading schema configuration from storage. Patch by Gary Dusbabek, reviewed by Jonthan Ellis. thrift changes. Patch by Gary Dusbabek, reviewed by Jonthan Ellis. Gossip metadata version and request updates. Patch by Gary Dusbabek, reviewed by Jonthan Ellis. move ConfigurationException to its own class file. Patch by Gary Dusbabek, reviewed by Jonthan Ellis. Refactor so meta mutations can be serialized and moved around. Patch by Gary Dusbabek, reviewed by Jonthan Ellis. remove unused JoinMessage. Patch by Gary Dusbabek, reviewed by Jonthan Ellis.
        Gavin made changes -
        Workflow no-reopen-closed, patch-avail [ 12499935 ] patch-available, re-open possible [ 12751036 ]
        Gavin made changes -
        Workflow patch-available, re-open possible [ 12751036 ] reopen-resolved, no closed status, patch-avail, testing [ 12754802 ]

          People

          • Assignee:
            Gary Dusbabek
            Reporter:
            Gary Dusbabek
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development