Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Fix Version/s: 1.1.0
    • Component/s: Core
    • Labels:
      None

      Description

      CASSANDRA-1292 fixed multiple migrations started from the same node to properly queue themselves, but it is still possible for migrations initiated on different nodes to conflict and leave the cluster in a bad state. Since the system_add/drop/rename methods are accessible directly from the client API, they should be completely safe for concurrent use.

      It should be possible to allow for most types of concurrent migrations by converting the UUID schema ID into a VersionVectorClock (as provided by CASSANDRA-580).
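      The VersionVectorClock idea can be sketched as follows (a hypothetical minimal example; `SchemaVersionVector`, `witness`, and `compareTo` are illustrative names, not the CASSANDRA-580 API). Two clocks that each contain updates the other lacks are reported as concurrent, which is exactly the conflicting-migration case that needs merging rather than rejection:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of a version-vector schema clock: each node tracks its own
// migration counter, and two clocks are ordered only when one dominates
// the other component-wise. Incomparable clocks signal concurrent
// migrations that must be merged.
public class SchemaVersionVector
{
    public enum Order { BEFORE, AFTER, EQUAL, CONCURRENT }

    private final Map<String, Long> counters = new HashMap<>();

    /** Record a migration initiated by the given node. */
    public void witness(String nodeId)
    {
        counters.merge(nodeId, 1L, Long::sum);
    }

    public Order compareTo(SchemaVersionVector other)
    {
        boolean ahead = false, behind = false;
        Map<String, Long> all = new HashMap<>(counters);
        other.counters.forEach((node, count) -> all.putIfAbsent(node, 0L));
        for (String node : all.keySet())
        {
            long mine = counters.getOrDefault(node, 0L);
            long theirs = other.counters.getOrDefault(node, 0L);
            if (mine > theirs) ahead = true;
            if (mine < theirs) behind = true;
        }
        if (ahead && behind) return Order.CONCURRENT;
        if (ahead) return Order.AFTER;
        if (behind) return Order.BEFORE;
        return Order.EQUAL;
    }
}
```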

      1. CASSANDRA-1391.patch
        200 kB
        Pavel Yaskevich
      2. 1391-rebased.txt
        205 kB
        Jonathan Ellis
      3. 0001-CASSANDRA-1391-main.patch
        203 kB
        Pavel Yaskevich
      4. 0002-CASSANDRA-1391-fixes.patch
        28 kB
        Pavel Yaskevich
      5. CASSANDRA-1391-v2.patch
        212 kB
        Pavel Yaskevich

        Issue Links

          Activity

          Stu Hood created issue -
          Stu Hood added a comment -

          But, vitally, if concurrent Migrations do conflict in a way that can't be resolved, keyspaces that didn't conflict should not be affected.

          Jonathan Ellis added a comment -

          ISTM we can allow concurrent migrations by computing the schema ID as an md5 of the keyspaces and CFs, instead of pushing that to VC. It's exactly the kind of set-merge problem that both columns-within-a-CF and VC can handle.

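          The content-based schema ID suggested here might look like the following (a hedged sketch; `SchemaId.md5Of` and the string-map representation of keyspace/CF definitions are assumptions for illustration). Because the ID is a hash of the schema contents rather than the migration history, nodes that converge on the same definitions report the same ID regardless of the order migrations arrived in:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Sketch: compute a schema ID as the md5 of sorted keyspace/CF
// definitions, so the ID depends only on content, not on history.
public class SchemaId
{
    public static String md5Of(SortedMap<String, String> definitions)
    {
        try
        {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            // SortedMap iteration order is deterministic, so equal
            // definition sets always hash to the same ID.
            for (Map.Entry<String, String> e : definitions.entrySet())
            {
                md5.update(e.getKey().getBytes(StandardCharsets.UTF_8));
                md5.update(e.getValue().getBytes(StandardCharsets.UTF_8));
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : md5.digest())
                hex.append(String.format("%02x", b));
            return hex.toString();
        }
        catch (java.security.NoSuchAlgorithmException e)
        {
            throw new AssertionError("MD5 is always available", e);
        }
    }
}
```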
          Stu Hood made changes -
          Field Original Value New Value
          Link This issue is blocked by CASSANDRA-2536 [ CASSANDRA-2536 ]
          Stu Hood made changes -
          Link This issue is blocked by CASSANDRA-2536 [ CASSANDRA-2536 ]
          Marko Mikulicic added a comment -

          Is there a way to fix this bad state? I'm not sure if this bug affects me or something similar, but my cluster cannot create new keyspaces:

          "Cluster schema does not yet agree"

          I tried to drop all the nodes but one, but it still complains. Any idea?

          Jonathan Ellis added a comment -

          Does this sound sane at all to you, Gary?

          Jonathan Ellis made changes -
          Assignee Gary Dusbabek [ gdusbabek ]
          Gary Dusbabek added a comment -

          Not with the current state of things.

          Jonathan Ellis added a comment -

          What would have to change?

          Gary Dusbabek added a comment - - edited

          Bottom line: we need to solve the problem of how to handle conflicts. I view this as similar to the problem of handling merge conflicts in change sets: a good subset of the conflicts can be merged automatically because they are independent of each other. But every once in a while there is a conflict that needs a manual edit; the solution is not computable because it isn't deterministic.

          This is currently addressed by strictly enforcing the relationship of a migration with its predecessor, ensuring that all migrations are applied serially.

          There is probably a pragmatic approach that I'm not able to see, because the merge-conflict problem makes it a non-starter for me.

          Jonathan Ellis added a comment -

          Seems like if we could come up with some way of "ordering" schema components we could solve that: user A says "set default_validation_class=ascii", B says "d_v_c=utf8", but we semi-arbitrarily decide that ascii is higher priority so when they conflict ascii wins.

          Gary Dusbabek added a comment -

          Seems like if we could come up with some way of "ordering" schema components

          A Lamport-ish clock consisting of node name and counter/timestamp would probably be sufficient for ordering.

          This could be used with an approach that quarantines schema changes for a period of time. This would allow for changes to come in from throughout the cluster and would allow them to be reordered before being applied.

          I think this is sensible, but still gives some rope from which we can hang ourselves--the strict predecessor relationship is gone and we'd have to trust that nodes would be doing the right thing (by applying migrations) independently after the quarantine is over.

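          The Lamport-ish clock described above could be as simple as a (counter, node name) pair compared lexicographically (a sketch; `MigrationClock` is an illustrative name, not the actual patch API). The node name only breaks ties, so the result is a total order even when two nodes pick the same counter:

```java
// Sketch of a Lamport-ish migration clock: a logical counter plus the
// initiating node's name as a tiebreaker gives every migration a unique,
// totally ordered position.
public class MigrationClock implements Comparable<MigrationClock>
{
    final long counter;     // logical timestamp, bumped past anything already seen
    final String nodeName;  // tiebreaker for equal counters

    public MigrationClock(long counter, String nodeName)
    {
        this.counter = counter;
        this.nodeName = nodeName;
    }

    @Override
    public int compareTo(MigrationClock other)
    {
        if (counter != other.counter)
            return Long.compare(counter, other.counter);
        return nodeName.compareTo(other.nodeName);
    }
}
```

          During a quarantine window, queued migrations could simply be sorted by this comparator before being applied.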
          Jonathan Ellis added a comment -

          A lamport-ish clock consisting of node name and counter/timestamp would probably be sufficient for ordering

          That would probably work too, although it feels weird to base it on who generated the migration, rather than the migration contents.

          we'd have to trust that nodes would be doing the right thing (by applying migrations) independently after the quarantine is over

          Right. I'm totally comfortable with this, feels like a good fit with how the rest of the system works.

          Jonathan Ellis added a comment -

          Thinking about this more, the really important part is that all nodes agree on the same schema no matter what order they get the Migrations in. If we can make that guarantee, the actual conflict resolution doesn't have to be particularly good (since it will still be a rare occurrence).

          So what is "the simplest thing that can work" here?

          I think we need to be able to merge Migrations at a finer granularity.

          If we do not, we have problems like this:

          • Migration 1 (M1) says "set default_validation_class=ascii, comment='foo'" at time T1.
          • M2 says "set row_cache_size=1000000" at time T0 < T1.

          If node A gets M2 first, applies it, then gets M1, it has all 3 changes made. If node B, however, gets M1 first and then rejects M2 because T0 < T1 (for whatever kind of clock/comparator we are talking about), nodes A and B will end up with different schemas.

          I think wall-clock-time plus content-based tiebreaker (like we currently do with Column values) will be just as good as more complex ordering, as long as we have the fine-grained merging.

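          The fine-grained, order-independent merge described above can be sketched as a per-attribute last-write-wins map with a content tiebreaker, like Column reconciliation (all names here, `AttributeMerge`, `Stamped`, are illustrative assumptions, not the patch's API). Because the merge function picks the max by (timestamp, value), it is commutative, so both delivery orders from the M1/M2 example converge:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of fine-grained migration merging: each schema attribute carries
// its own timestamp, so independent attributes merge cleanly and every
// node converges regardless of delivery order. Equal timestamps break
// ties on the value itself.
public class AttributeMerge
{
    public static final class Stamped
    {
        public final String value;
        public final long timestamp;
        public Stamped(String value, long timestamp) { this.value = value; this.timestamp = timestamp; }
    }

    /** Merge b into a, returning a new map; commutative and associative. */
    public static Map<String, Stamped> merge(Map<String, Stamped> a, Map<String, Stamped> b)
    {
        Map<String, Stamped> out = new HashMap<>(a);
        b.forEach((attr, candidate) -> out.merge(attr, candidate, (x, y) -> {
            if (x.timestamp != y.timestamp)
                return x.timestamp > y.timestamp ? x : y; // wall-clock wins
            return x.value.compareTo(y.value) >= 0 ? x : y; // content tiebreaker
        }));
        return out;
    }
}
```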
          Jonathan Ellis made changes -
          Assignee Gary Dusbabek [ gdusbabek ] Pavel Yaskevich [ xedin ]
          Pavel Yaskevich made changes -
          Status Open [ 1 ] In Progress [ 3 ]
          Pavel Yaskevich made changes -
          Fix Version/s 1.0 [ 12316349 ]
          Pavel Yaskevich added a comment -

          Moved all static collections that hold information about schema from the DatabaseDescriptor/CFMetaData/Table classes to a unified o.a.c.config.DBSchema class (also refactored the o.a.c.db.migration classes to support that change), which will also enable us to apply schema migrations in an isolated environment.

          Pavel Yaskevich made changes -
          Attachment 0001-CASSANDRA-1391-refactoring-of-the-DatabaseDescriptor.patch [ 12491071 ]
          Pavel Yaskevich made changes -
          Status In Progress [ 3 ] Patch Available [ 10002 ]
          Reviewer jbellis
          Jonathan Ellis added a comment -

          I'd prefer "Schema" to DBSchema, and "Schema.instance" to a static final DBD.schema reference (and inlining the DBD.getX calls that just wrap a Schema method of the same name).

          Otherwise +1

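          The suggested shape might look like the following (a sketch under the stated naming preference; `cfIdMap`, `getId`, and `load` are illustrative members, not the actual Cassandra API). Callers use `Schema.instance` directly instead of going through DatabaseDescriptor wrappers of the same name:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the Schema singleton: one shared instance holds the schema
// maps formerly scattered as statics across DatabaseDescriptor and friends.
public class Schema
{
    public static final Schema instance = new Schema();

    // hypothetical example of one of the relocated static collections
    private final Map<String, Integer> cfIdMap = new ConcurrentHashMap<>();

    private Schema() {}

    public Integer getId(String ksAndCf)
    {
        return cfIdMap.get(ksAndCf);
    }

    public void load(String ksAndCf, int cfId)
    {
        cfIdMap.put(ksAndCf, cfId);
    }
}
```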
          Pavel Yaskevich added a comment -

          Ok, I will do the renaming and inlining and commit this, thanks!

          Jonathan Ellis made changes -
          Status Patch Available [ 10002 ] In Progress [ 3 ]
          Pavel Yaskevich added a comment - - edited

          Committed 0001 with changes proposed by Jonathan. 0002 will follow up with migration merge strategy.

          Hudson added a comment -

          Integrated in Cassandra #1043 (See https://builds.apache.org/job/Cassandra/1043/)
          Refactoring of the DatabaseDescriptor/CFMetadata/Table to support o.a.c.config.Schema which will be handling all schema querying/mutations
          patch by Pavel Yaskevich; reviewed by Jonathan Ellis for CASSANDRA-1391

          xedin : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1160494
          Files :

          • /cassandra/trunk/src/java/org/apache/cassandra/cql/DropIndexStatement.java
          • /cassandra/trunk/test/unit/org/apache/cassandra/db/migration/SerializationsTest.java
          • /cassandra/trunk/test/long/org/apache/cassandra/db/compaction/LongCompactionSpeedTest.java
          • /cassandra/trunk/src/java/org/apache/cassandra/db/Table.java
          • /cassandra/trunk/src/java/org/apache/cassandra/db/migration/UpdateColumnFamily.java
          • /cassandra/trunk/src/java/org/apache/cassandra/cql/UpdateStatement.java
          • /cassandra/trunk/test/unit/org/apache/cassandra/db/SerializationsTest.java
          • /cassandra/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLog.java
          • /cassandra/trunk/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
          • /cassandra/trunk/src/java/org/apache/cassandra/io/sstable/SSTableWriter.java
          • /cassandra/trunk/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
          • /cassandra/trunk/src/java/org/apache/cassandra/service/StorageService.java
          • /cassandra/trunk/src/java/org/apache/cassandra/cache/AutoSavingCache.java
          • /cassandra/trunk/src/java/org/apache/cassandra/db/ColumnFamily.java
          • /cassandra/trunk/src/java/org/apache/cassandra/db/RowMutation.java
          • /cassandra/trunk/test/unit/org/apache/cassandra/thrift/ThriftValidationTest.java
          • /cassandra/trunk/src/java/org/apache/cassandra/db/migration/DropKeyspace.java
          • /cassandra/trunk/src/java/org/apache/cassandra/thrift/ThriftValidation.java
          • /cassandra/trunk/src/java/org/apache/cassandra/cql/AlterTableStatement.java
          • /cassandra/trunk/src/java/org/apache/cassandra/dht/BootStrapper.java
          • /cassandra/trunk/test/unit/org/apache/cassandra/service/ConsistencyLevelTest.java
          • /cassandra/trunk/src/java/org/apache/cassandra/tools/SSTableImport.java
          • /cassandra/trunk/test/unit/org/apache/cassandra/db/DefsTest.java
          • /cassandra/trunk/test/unit/org/apache/cassandra/service/AntiEntropyServiceTestAbstract.java
          • /cassandra/trunk/test/unit/org/apache/cassandra/SchemaLoader.java
          • /cassandra/trunk/src/java/org/apache/cassandra/db/ColumnFamilySerializer.java
          • /cassandra/trunk/src/java/org/apache/cassandra/db/migration/DropColumnFamily.java
          • /cassandra/trunk/test/unit/org/apache/cassandra/locator/SimpleStrategyTest.java
          • /cassandra/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java
          • /cassandra/trunk/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
          • /cassandra/trunk/src/java/org/apache/cassandra/db/migration/RenameKeyspace.java
          • /cassandra/trunk/src/java/org/apache/cassandra/dht/OrderPreservingPartitioner.java
          • /cassandra/trunk/test/unit/org/apache/cassandra/config/DatabaseDescriptorTest.java
          • /cassandra/trunk/src/java/org/apache/cassandra/db/SchemaCheckVerbHandler.java
          • /cassandra/trunk/test/unit/org/apache/cassandra/service/LeaveAndBootstrapTest.java
          • /cassandra/trunk/src/java/org/apache/cassandra/db/migration/Migration.java
          • /cassandra/trunk/src/java/org/apache/cassandra/cql/DeleteStatement.java
          • /cassandra/trunk/src/java/org/apache/cassandra/db/DefinitionsUpdateVerbHandler.java
          • /cassandra/trunk/src/java/org/apache/cassandra/cql/QueryProcessor.java
          • /cassandra/trunk/src/java/org/apache/cassandra/service/MigrationManager.java
          • /cassandra/trunk/src/java/org/apache/cassandra/service/AbstractCassandraDaemon.java
          • /cassandra/trunk/src/java/org/apache/cassandra/tools/SSTableExport.java
          • /cassandra/trunk/src/java/org/apache/cassandra/config/Schema.java
          • /cassandra/trunk/src/java/org/apache/cassandra/db/migration/AddKeyspace.java
          • /cassandra/trunk/src/java/org/apache/cassandra/thrift/CassandraServer.java
          • /cassandra/trunk/src/java/org/apache/cassandra/service/ReadCallback.java
          • /cassandra/trunk/src/java/org/apache/cassandra/dht/AbstractByteOrderedPartitioner.java
          • /cassandra/trunk/src/java/org/apache/cassandra/db/migration/RenameColumnFamily.java
          • /cassandra/trunk/test/unit/org/apache/cassandra/dht/BootStrapperTest.java
          • /cassandra/trunk/test/unit/org/apache/cassandra/service/MoveTest.java
          • /cassandra/trunk/src/java/org/apache/cassandra/db/DefsTable.java
          • /cassandra/trunk/src/java/org/apache/cassandra/service/StorageProxy.java
          • /cassandra/trunk/src/java/org/apache/cassandra/cql/SelectStatement.java
          • /cassandra/trunk/src/java/org/apache/cassandra/db/migration/UpdateKeyspace.java
          • /cassandra/trunk/src/java/org/apache/cassandra/db/migration/AddColumnFamily.java
          • /cassandra/trunk/src/java/org/apache/cassandra/io/sstable/SSTableReader.java
          • /cassandra/trunk/src/java/org/apache/cassandra/config/CFMetaData.java
          Pavel Yaskevich added a comment -

          Update{Keyspace/ColumnFamily} migrations now update only modified fields instead of blindly copying everything; applyModels can now be skipped if it makes no modifications.

          The merge algorithm is based on an isolated schema initialized from the merging migration's lastVersion point: the merging migration is applied first, then all older migrations; after that, Schema.instance gets safely updated.

          Rebased with latest trunk (last commit 38a55bab7ee054bce83028b5f0abda3e9e9e7ebf)

          Pavel Yaskevich made changes -
          Attachment CASSANDRA-1391.patch [ 12491803 ]
          Pavel Yaskevich made changes -
          Status In Progress [ 3 ] Patch Available [ 10002 ]
          Jonathan Ellis made changes -
          Attachment 0001-CASSANDRA-1391-refactoring-of-the-DatabaseDescriptor.patch [ 12491071 ]
          Gary Dusbabek added a comment -

          Pavel, can you please rebase?

          Pavel Yaskevich added a comment -

          rebased with trunk (last commit 0a4b1667bee674f7c0a22057cbdab97e368a20d1)

          Pavel Yaskevich made changes -
          Attachment CASSANDRA-1391.patch [ 12492070 ]
          Pavel Yaskevich added a comment -

          rebased with latest trunk (last commit 38a51a8d2da4c3756a5b5fe5709a1aa97300250e)

          Pavel Yaskevich made changes -
          Attachment CASSANDRA-1391.patch [ 12492269 ]
          Pavel Yaskevich added a comment -

          I have figured out a problem with Avro - it can't correctly handle map comparison: "org.apache.avro.AvroRuntimeException: Can't compare maps!" I will try to fix this issue and re-attach a working patch.

          Pavel Yaskevich made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Pavel Yaskevich made changes -
          Status Open [ 1 ] In Progress [ 3 ]
          Pavel Yaskevich made changes -
          Attachment CASSANDRA-1391.patch [ 12492269 ]
          Pavel Yaskevich made changes -
          Attachment CASSANDRA-1391.patch [ 12492070 ]
          Pavel Yaskevich made changes -
          Attachment CASSANDRA-1391.patch [ 12491803 ]
          Pavel Yaskevich added a comment -

          Updated patch to fix the problem with Avro map comparisons. This has a side effect in a good sense: you don't need to include options that you don't want to update when you do a CLI `update {keyspace/column family}`.

          Pavel Yaskevich made changes -
          Attachment CASSANDRA-1391.patch [ 12492448 ]
          Pavel Yaskevich made changes -
          Attachment CASSANDRA-1391.patch [ 12492448 ]
          Pavel Yaskevich made changes -
          Attachment CASSANDRA-1391.patch [ 12492449 ]
          Pavel Yaskevich added a comment -

          rebased with latest trunk (last commit 38a51a8d2da4c3756a5b5fe5709a1aa97300250e)

          Pavel Yaskevich made changes -
          Status In Progress [ 3 ] Patch Available [ 10002 ]
          Pavel Yaskevich made changes -
          Attachment CASSANDRA-1391.patch [ 12492449 ]
          Pavel Yaskevich added a comment -

          rebased.

          Pavel Yaskevich made changes -
          Attachment CASSANDRA-1391.patch [ 12492524 ]
          Jonathan Ellis added a comment -

          What is the isIsolated mode stuff for?

          Pavel Yaskevich added a comment -

          The isolated flag is used to indicate that a migration will be applied using the isolated schema, so the SystemTable.Schema record for LAST_MIGRATION_KEY won't be updated and no real file operations will be made. This allows us to deserialize old migrations and re-apply them one-by-one in the case of a merging migration (after each re-apply, information about that migration gets updated in SystemTable.Schema and SystemTable.Migrations).

          Pavel Yaskevich made changes -
          Attachment CASSANDRA-1391.patch [ 12492524 ]
          Pavel Yaskevich added a comment -

          rebased and fixed SchemaMergeTest to work post CASSANDRA-3001

          Pavel Yaskevich made changes -
          Attachment CASSANDRA-1391.patch [ 12492603 ]
          Gary Dusbabek added a comment -

          The merge algorithm is based on an isolated schema initialized from the merging migration's lastVersion point: the merging migration is applied first, then all older migrations; after that, Schema.instance gets safely updated.

          Could you clarify what you mean by "merging migration applied first, then all older migrations..."?

          It seems like a side effect of applying a migration is that it can apply other migrations. Does MigrationManager.applyMigrations() need to be updated because of this?

          What does "isolated" indicate?

          Try to put things like flushSystemTables() in a separate patch (ok on the same ticket) to make reviewing the actual changes easier.

          Would it be possible to create some unit tests for CFMD.diff()?

          Pavel Yaskevich added a comment - - edited

          Could you clarify what you mean by "merging migration applied first, then all older migrations..."?

          Take a look at Migration.apply(), starting from line 114, and the Migration.tryMerge method.

          If we detect that the current migration is outdated and should be merged, we take the following actions:

          • initialize an isolated Schema from the point of the migration's lastVersion (this sets isolated = true)
          • reload the migration's system definition to reflect the isolated schema
          • call applyModels on the merging migration to apply its schema changes
          • merge phase:
            • read from SystemTable.Migrations all migrations that come after the current one
            • for each of those migrations:
              • replace their schema with the isolated one (from the merging migration) and reload the system definition
              • call the apply() method to re-write records in SystemTable.Migrations and SystemTable.Schema
          • after all migrations are applied, we try to merge the isolated schema with the current system schema (Schema.instance)
          • flush system tables to persist changes
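          In rough, runnable form, the sequence above looks something like this (a minimal Python sketch; merge_outdated_migration and the dict-based schema/migration shapes are hypothetical stand-ins for the real Migration/Schema classes, not the actual API):

```python
import copy

def merge_outdated_migration(merging, later_migrations, system_schema):
    """Replay an out-of-date migration plus everything after it on an
    isolated schema copy, then fold the result back into the live schema.
    All names here are illustrative, not the actual Cassandra classes."""
    # initialize an isolated schema from the merging migration's lastVersion point
    isolated = copy.deepcopy(merging["last_version_schema"])
    isolated["isolated"] = True  # no real file operations while replaying

    # apply the merging migration's own changes first
    isolated["defs"].update(merging["changes"])

    # merge phase: re-apply every migration that comes after it, in order
    for m in later_migrations:
        isolated["defs"].update(m["changes"])

    # finally, merge the isolated result into the system schema
    isolated["isolated"] = False
    system_schema["defs"].update(isolated["defs"])
    return system_schema
```

          The point is that the replay happens entirely on the isolated copy, so the live schema is only touched once, at the end.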

          It seems like a side effect of applying a migration is that it can apply other migrations. Does MigrationManager.applyMigrations() need to be updated because of this?

          No, because all modifications are done using the isolated schema.

          What does "isolated" indicate?

          Isolated indicates that the migration will be applied against an isolated Schema, so no real file operations are performed, such as snapshotting, creating the keyspace directory, removing SSTable files, etc.

          Try to put things like flushSystemTables() in a separate patch (ok on the same ticket) to make reviewing the actual changes easier.

          I see only one such refactoring change; is it really worth splitting the current patch?

          Would it be possible to create some unit tests for CFMD.diff()

          CFMD.diff is used all over the place, so if it were broken other tests would fail, but if you think this is necessary I can do that.

          Jonathan Ellis added a comment -

          if we detect that current migration is outdated and should be merged we do the following actions

          Why do we need to do two code paths here?

          My preference would be to model this on our Row conflict resolution: for rows, we have a single code path where distinct columns are simply merged, and for conflicting columns we pick a winner based on user-provided timestamp and, if necessary, value contents. So the result is guaranteed to be the same on all replicas no matter what order updates were received in.

          Similarly, I'd like to see schemas merge field-at-a-time in KSMetadata/CFMetadata, with commutative conflict resolution. (I suggested byte ordering of the field contents; Gary suggested using some clock value created by coordinator.)

          Seems to me this would make the "isolation" complexity go away.

          Pavel Yaskevich added a comment -

          Why do we need to do two code paths here? My preference would be to model this on our Row conflict resolution: for rows, we have a single code path where distinct columns are simply merged, and for conflicting columns we pick a winner based on user-provided timestamp and, if necessary, value contents. So the result is guaranteed to be the same on all replicas no matter what order updates were received in.

          Migration merging is a more complex process compared to row merging, which is pretty straightforward. The current approach easily handles all possible conflicts without any tie-breakers or coordinators: it simply detects what modifications were made by each of the migrations starting from the merging one, combines those modifications together in the isolated schema, and updates Schema.instance, so the resulting schema is guaranteed to be the same on all replicas no matter what order migrations were received in.

          Seems to me this would make the "isolation" complexity go away.

          I still think that this is the simplest solution of all proposed, because the actual modifications are: KSMetaData/CFMetaData.diff(...) methods to detect modified fields, one flag member (Migration.isolated) to indicate that the migration is running in isolated mode, and one method to update the system Schema.instance with the resulting Schema after all migrations are applied.

          Jonathan Ellis added a comment -

          You may be right, but I'd need some time to convince myself of that and it sounds from IRC that Gary would too. I'm going to push this to 1.1.

          Jonathan Ellis made changes -
          Fix Version/s 1.1 [ 12317615 ]
          Fix Version/s 1.0 [ 12316349 ]
          Affects Version/s 0.7.0 [ 12316026 ]
          Jonathan Ellis added a comment -

          My preference would be to model this on our Row conflict resolution

          I think we can make it even simpler, by moving schema out of avro and into native CFs.

          Each KS/CF would be a row. Attributes would be columns. When columns change you run the appropriate ALTER code. If you get an update that's obsolete (applying it does not change the schema b/c it has an older timestamp), then it is a no-op.

          Schema version would become some kind of md5 or sha of the CF contents (all rows + all columns).

          The only problem is you need to be a little careful to open the schema CFs before anything else, but that's relatively easy I think.

          T Jake Luciani added a comment -

          I think the patch is a good start, but I do like Jonathan's idea of moving to native CFs too.

          We just want to make sure this gets into 1.1 since it's a problem a lot of people run into.

          Regarding the current impl, my concern is missing fields added to the migration structs over time, like we had happen a lot in the CFMetaData conversion code.

          Could you add a test that verifies all migration struct fields are accounted for in the merge logic? That way, if someone adds a new field and doesn't update the migration merge logic, this test would fail.

          Pavel Yaskevich made changes -
          Status Patch Available [ 10002 ] In Progress [ 3 ]
          Pavel Yaskevich added a comment -

          It seems like what we really want from migrations is the "schema state" before any given migration plus the actual modifications the migration makes, like "add keyspace <ks> with attributes = ...", "update <cf> with attributes = ...".

          As all of the migrations are user initiated, we can easily calculate what modifications a migration makes and propagate only those, keeping TimeUUID as the ID of the migration to identify "apply" order. As it's okay for us to require a full cluster update before accepting schema modifications, merging becomes a trivial task where modifications are applied one-by-one on some "initial state" of the schema (which also allows us to remove the Avro overhead from migrations). Abandoning Avro would make things less fragile because there would be no need to modify CFMetaData or any other classes to support new (or deleted) attributes.

          T Jake Luciani added a comment - - edited

          By moving to CF-based migration logic, it would be very useful to have the logic abstracted from schemas so it can be used for other use cases.

          Migrations give you the following:

          • RF = N where N is the size of the ring.
          • All changes are "pushed" to new nodes when they join the ring.
          • previously sent data is available locally on startup
          Jonathan Ellis added a comment -

          As all of the migrations are user initiated we can easily calculate what modifications migration makes and propagate only them

          Yes, that's implied by moving to a CF-based structure I think.

          keeping TimeUUID as ID of the migration to identify "apply" order

          I don't see why you'd want to complicate things by doing this. We already have perfectly good per-column conflict resolution without it.

          it would be very useful to have the logic abstracted from schemas so it can be used for other use cases

          I'm okay in principle with introducing a EveryNodeReplicationStrategy but refactoring migrations to that is a separate step from this ticket.

          Pavel Yaskevich added a comment -

          So you suggest we replicate schema the same way as we do data?

          Jonathan Ellis added a comment -

          It seems to make sense eventually. But for this ticket let's stick with LocalPartitioner and the existing schema replication.

          Pavel Yaskevich added a comment -

          Then we need to preserve the order of the migrations that we accept from remote nodes; otherwise we don't have sufficient information to apply modifications, or am I missing something? Can you please briefly describe the process as you see it?

          Jonathan Ellis added a comment -

          Here is one way to model it:

          keyspaces

          name: { // key
            'durable_writes': bool,
            'replication_strategy' : { // composite columns ahead!
              'class': str,
              'options': {
                 // strat-dependent options
              }
            }
          }
          
          Example (w/ composite names spelled out):
          'Keyspace1': {
            'durable_writes': True,
            ('replication_strategy', 'class'): 'SimpleStrategy',
            ('replication_strategy', 'options', 'replication_factor'): 1
          }
          

          columnfamilies

          name: { // key
            'keyspace': str,
            'comparator': str,
            ... 
            'columns': { // composite!
              column name: {
                'validation_class': str,
                'index_type': str,
                'index_name': str,
                'index_options': { }
              }
            }
          }
          

          Since each option is its own column, we leverage Cassandra's own conflict resolution: it's easy to see how if client 1 changes Keyspace1.durable_writes to False, and another client changes replication_strategy to NTS, there is no conflict and life is good. But if one changes r_s to NTS and another changes it to ONTS, then the one w/ higher timestamp wins. This is important to tolerate different nodes receiving the changes in different orders, which is necessary for everyone to end up in the same state.

          Jonathan Ellis added a comment - - edited

          For example, suppose we have two clients C1 and C2, and two nodes A and B. They have a columnfamily that includes the following options:

          ColumnFamily1: {
            'default_validation_class': 'bytes',
            'comment': 'an example',
            'row_cache_size': 0
          }
          

          Next, C1 and C2 update ColumnFamily1 with M1 and M2 as follows:

          M1 = ColumnFamily1: {'d_v_c': 'ascii', 'comment': 'foo'} @ T1
          M2 = ColumnFamily1: {'d_v_c': 'utf8', 'row_cache_size': 1000000} @ T0 < T1
          

          (Note that we have a conflict on default_validation_class.)

          Node A receives M1 first, while node B receives M2 first.

          Node A:

          (applies M1)
          ColumnFamily1: {
            'default_validation_class': 'ascii',
            'comment': 'foo',
            'row_cache_size': 0
          }
          
          (applies M2)
          ColumnFamily1: {
            'default_validation_class': 'ascii', // M2.dvc has no effect since T0 < T1
            'comment': 'foo',
            'row_cache_size': 1000000
          }
          

          Node B:

          (applies M2)
          ColumnFamily1: {
            'default_validation_class': 'utf8',
            'comment': 'an example',
            'row_cache_size': 1000000
          }
          
          (applies M1)
          ColumnFamily1: {
            'default_validation_class': 'ascii',
            'comment': 'foo',
            'row_cache_size': 1000000
          }
          

          Because timestamp-based conflict resolution is commutative, all nodes end with the same schema no matter what order they get the updates in.
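          The commutativity claimed above can be demonstrated with a toy last-write-wins merge (illustrative Python, not Cassandra code; apply_migration is a hypothetical helper):

```python
def apply_migration(schema, migration, timestamp, ts_map):
    """Per-column LWW: a column only changes if the incoming
    timestamp is newer than the one already recorded for it."""
    for col, val in migration.items():
        if timestamp > ts_map.get(col, float("-inf")):
            schema[col] = val
            ts_map[col] = timestamp

base = {"default_validation_class": "bytes", "comment": "an example", "row_cache_size": 0}
M1 = ({"default_validation_class": "ascii", "comment": "foo"}, 2)          # @ T1
M2 = ({"default_validation_class": "utf8", "row_cache_size": 1000000}, 1)  # @ T0 < T1

# Node A applies M1 then M2; node B applies M2 then M1
node_a, ts_a = dict(base), {}
node_b, ts_b = dict(base), {}
for mig, ts in (M1, M2):
    apply_migration(node_a, mig, ts, ts_a)
for mig, ts in (M2, M1):
    apply_migration(node_b, mig, ts, ts_b)

# both orders converge on the same schema
assert node_a == node_b == {"default_validation_class": "ascii",
                            "comment": "foo", "row_cache_size": 1000000}
```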

          Jonathan Ellis added a comment -

          I'm enthusiastic about this approach for several reasons:

          • ultimately we end up with simpler code with less special cases, although the migration (cough) from Avro-based schema will be a pain initially
          • gets rid of Avro dependency!
          • fixes CASSANDRA-2477 ("SELECT * FROM keyspaces")
          Pavel Yaskevich added a comment -

          The new migration model uses the internal 'system'.'keyspaces' CF as backing storage for the state of the schema. The internal structure of the 'keyspaces' CF was described in Jonathan's comment. Decisions about attribute updates are made according to the timestamps of the columns in that new CF.

          The Avro dependency is completely removed; all to/from Avro methods have been cleaned up. Please make sure that you apply both patches.

          Rebased with the latest trunk (last commit 3ed1e0789f10de8e9638bec20bed2e59fa35b0d0)

          Pavel Yaskevich made changes -
          Attachment 0001-new-migration-schema-and-avro-methods-cleanup.patch [ 12505202 ]
          Attachment 0002-avro-removal.patch [ 12505203 ]
          Pavel Yaskevich made changes -
          Status In Progress [ 3 ] Patch Available [ 10002 ]
          T Jake Luciani added a comment -

          If you remove avro how do people upgrade?

          Pavel Yaskevich added a comment -

          I'm planning to add a special tool in a separate patch to help convert the schema from Avro to the new model. I don't see a reason to keep Avro in lib, as the core code does not use it.

          Pavel Yaskevich made changes -
          Attachment 0001-new-migration-schema-and-avro-methods-cleanup.patch [ 12505202 ]
          Pavel Yaskevich made changes -
          Attachment 0002-avro-removal.patch [ 12505203 ]
          Pavel Yaskevich added a comment -

          Rebased with the latest trunk (last commit e37bd7e8d344332ff41bd1015e6018c81ca81fa3).

          Pavel Yaskevich made changes -
          Attachment 0001-new-migration-schema-and-avro-methods-cleanup.patch [ 12507230 ]
          Attachment 0002-avro-removal.patch [ 12507231 ]
          Jonathan Ellis added a comment -

          Thanks, Pavel. This is getting closer. But I think continuing to use UUIDs is the wrong approach. In particular, code like this means we've failed to achieve our goal:

          if (newVersion.timestamp() <= lastVersion.timestamp())
              throw new ConfigurationException("New version timestamp is not newer than the current version timestamp.");
          

          If two migrations X and Y propagate through the cluster concurrently from different coordinators, some nodes will apply X first, some Y; whichever migration has a lower timestamp will then error out on the remaining nodes and we'll end up with the same kind of version conflict snafu we encounter now.

          Here's how I think it should work:

          • Coordinator turns KsDef and CfDef objects into RowMutations by applying them to the existing (local) schema. Maybe you use something like your attributesToCheck code since you already have that written. Give that mutation a normal local timestamp (FBU.timestampMicros).

          Then each node applying the change:

          • makes a deep copy of the existing schema ColumnFamily objects
          • calls Table.apply on the migration RowMutations
          • calls ColumnFamily.diff on the new schema ColumnFamily object vs the copied one. (This is where I was going above by saying "let the existing resolve code do the work." No matter which order nodes apply X and Y in, they will always agree on the result after applying both. Note that this does not depend on X and Y getting "correctly" ordered timestamps, either.)
          • makes the appropriate Table + CFS + Schema changes dictated by the diff
          • (the above obviously needs to be synchronized at least against the Table/CFS objects affected)

          Schema "version" may then be computed as an md5 of the Schema objects. (Again: goal is that nodes can apply X and Y in any order, and we don't care. So version needs to be entirely content-based, not time-based.) Probably the easiest way to do this is to just use CF.updateDigest. We can cut this down to the first 16 bytes if we need to cram it into a UUID, but I don't see a reason for that (the Thrift API uses Strings already).
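          A content-based version along these lines can be sketched as follows (illustrative Python; schema_version and the dict-based schema are hypothetical stand-ins for digesting the real schema ColumnFamilies, e.g. via CF.updateDigest):

```python
import hashlib

def schema_version(schema):
    """Digest the schema contents in a canonical (sorted) order so the
    version depends only on content, never on application order."""
    md5 = hashlib.md5()
    for row_key in sorted(schema):
        for col in sorted(schema[row_key]):
            md5.update(f"{row_key}:{col}:{schema[row_key][col]}".encode())
    return md5.hexdigest()

# same content in different insertion order yields the same version
a = {"Keyspace1": {"durable_writes": True}, "CF1": {"comment": "foo"}}
b = {"CF1": {"comment": "foo"}, "Keyspace1": {"durable_writes": True}}
assert schema_version(a) == schema_version(b)
```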

          Nit: flushSystemCFs could use FBUtilities.waitOnFutures(flushes) instead of rolling its own multi-future wait.

          Pavel Yaskevich added a comment - - edited

          I figured out that we don't really need the "lastVersion" field in the Migration class anymore, so it's now removed. I have removed the migration ordering validation from DefinitionsUpdateVerbHandler and fixed the nit you mentioned. Also, I think we still need versions to be of TimeUUID type because it's useful, for example, in MigrationManager.rectify, where we need to know if, and which, migrations we want to send to others.

          Pavel Yaskevich made changes -
          Attachment 0003-oldVersion-removed-nit-fixed.patch [ 12507332 ]
          Jonathan Ellis added a comment -

          That doesn't work, though: what if we have two updates at the same timestamp? I think it really does need to be content-based.

          Also, I still think using Table.apply and CF.diff is the "right" way to do this, instead of effectively duplicating that code as a special case. Are there any downsides to that approach I'm missing?

          Pavel Yaskevich added a comment -

          We could compare uuids instead in the isMergingMigration method.

          How does a node determine whether it is ahead of or behind the rest of the ring with content-based versioning? And even if it can determine that, how does it find out which migrations it needs to send/receive to get the ring in sync?

          Jonathan Ellis added a comment -

          You don't need to – just send it the current schema, and apply/diff will take care of any redundancies. (This means we don't need to worry about schema propagation taking a long time during bootstrap or rebuild of a new node, either, as in CASSANDRA-3629 or CASSANDRA-2056.)

          Pavel Yaskevich added a comment -

          Isn't that an over-complication? Starting from step 2 in your previous comment, a node would always need to diff all of the CF objects plus determine whether any keyspaces were deleted/added, which migrations give us for free because we always know exactly what a migration modifies. Also, when a node starts with such a content-based version and its version does not match the others', does it really know what to do: send its own schema or request one? I also think that once the ring reaches some frequency of schema changes, this would create noticeable traffic and nodes won't be able to keep up with migrating anymore...

          Jonathan Ellis added a comment -

          Feels more like a simplification, than a complication to me. Syncing schema becomes always sending exactly one message instead of potentially hundreds/thousands. And, we get full concurrency support instead of "mostly" (the equal-timestamp weakness I mentioned). Seems worth it to me.

          Currently, schema changes are push-only (see: SS.pushMigrations). So without changing that, yes, both nodes will send schema to each other (which will be a no-op on the newer one). That's not a blocker for me. I'd be fine switching to a pull model in either this or a followup ticket, which would let the newer side skip its pull if it recognizes the remote version as one it used to have (which would reliably indicate it's "older" even if timestamp ties are involved).

          Traffic from schema changes will be negligible for any known workload. We can optimize for that if/when it becomes a problem (my prediction: never).

          Pavel Yaskevich added a comment -

          Sounds reasonable, and this would also change how we announce the new schema, so if possible, could we do it in a separate ticket (or subticket)? This one is getting really big, and I'd like to settle on the local migration-handling code before we start on the schema propagation changes...

          Jonathan Ellis added a comment -

          I'd rather do them together in this case, it's pretty hard to work in trunk w/o schema announce working.

          Pavel Yaskevich made changes -
          Attachment 0003-oldVersion-removed-nit-fixed.patch [ 12507332 ]
          Pavel Yaskevich made changes -
          Attachment 0001-new-migration-schema-and-avro-methods-cleanup.patch [ 12507230 ]
          Pavel Yaskevich made changes -
          Attachment 0002-avro-removal.patch [ 12507231 ]
          Pavel Yaskevich added a comment - - edited

          Rebased parts 1-2; part 3 handles the new way of schema distribution. Now, after applying a migration locally, each node calculates a content-based schema version (from the system.keyspaces CF) and gossips it. Other nodes listen via the onChange and onAlive methods; when a schema update is received, they match the version against their own, and if it differs, they request migrations by sending a MIGRATION_REQUEST message with a serialized list of their local migration ids attached. The coordinator sends the set of missing migrations (in serialized form) in response. Once the requesting node receives the migrations, it applies them locally one by one on the Migration stage.
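          The pull flow described here can be sketched as two small functions (hypothetical names, not the actual MigrationManager code): the requester decides whether to pull by comparing gossiped versions, and the coordinator answers a MIGRATION_REQUEST by diffing migration-id sets.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class MigrationPullSketch {
    // requester side: only pull when the gossiped remote version differs
    // from the local (content-based) version
    public static boolean shouldRequest(String localVersion, String remoteVersion) {
        return !localVersion.equals(remoteVersion);
    }

    // coordinator side: given the requester's local migration ids, reply
    // with the migrations the requester is missing
    public static List<String> missingMigrations(Set<String> coordinatorIds, Set<String> requesterIds) {
        List<String> missing = new ArrayList<>();
        for (String id : coordinatorIds)
            if (!requesterIds.contains(id))
                missing.add(id);
        return missing;
    }

    public static void main(String[] args) {
        Set<String> coordinator = new LinkedHashSet<>(Arrays.asList("m1", "m2", "m3"));
        Set<String> requester = new LinkedHashSet<>(Arrays.asList("m1", "m2"));
        System.out.println(missingMigrations(coordinator, requester)); // prints [m3]
    }
}
```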

          Pavel Yaskevich made changes -
          Attachment 0001-new-migration-schema-and-avro-methods-cleanup.patch [ 12508805 ]
          Attachment 0002-avro-removal.patch [ 12508806 ]
          Attachment 0003-oldVersion-removed-new-migration-distribution-schema.patch [ 12508807 ]
          Pavel Yaskevich made changes -
          Attachment 0001-new-migration-schema-and-avro-methods-cleanup.patch [ 12508805 ]
          Pavel Yaskevich made changes -
          Attachment 0002-avro-removal.patch [ 12508806 ]
          Pavel Yaskevich made changes -
          Attachment 0003-oldVersion-removed-new-migration-distribution-schema.patch [ 12508807 ]
          Pavel Yaskevich added a comment -

          rebased against the latest trunk (last commit 38c04fef0a431bf29010074bad1d35d87a739c02)

          Pavel Yaskevich made changes -
          Attachment 0001-new-migration-schema-and-avro-methods-cleanup.patch [ 12509553 ]
          Attachment 0002-avro-removal.patch [ 12509554 ]
          Attachment 0003-oldVersion-removed-new-migration-distribution-schema.patch [ 12509555 ]
          Pavel Yaskevich made changes -
          Attachment 0003-oldVersion-removed-new-migration-distribution-schema.patch [ 12509555 ]
          Pavel Yaskevich added a comment -

          0003 properly rebased

          Pavel Yaskevich made changes -
          Attachment 0003-oldVersion-removed-new-migration-distribution-schema.patch [ 12509562 ]
          Jonathan Ellis added a comment -

          While getting rid of avro is great, replacing "serialize to avro" with "serialize to thrift" isn't as much improvement as I was hoping for. I thought we were on board with modeling the schema "natively" in system columnfamilies as sketched in https://issues.apache.org/jira/browse/CASSANDRA-1391?focusedCommentId=13149875&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13149875. Which would allow the apply/diff design I keep talking about instead of having to do that manually in UCF, allow dropping the last_migration_key indirection, pave the way for CASSANDRA-2477, and probably more simplifications.

          Is there a reason that doesn't work?

          Pavel Yaskevich added a comment -

          I still don't see how/why apply/diff would be better than migrations (which are localized actions on the schema)...

          allow dropping the last_migration_key indirection

          We can drop it even with my current patches but what does it give us? Instead of converting deserialized thrift objects to KSMetaData we would need to initialize KSMetaData/CFMetaData and populate those with parameters from db, is it really better?

          pave the way for CASSANDRA-2477

          With the current patch, users will be able to do "SELECT * FROM system.keyspaces" and other queries, but only after CASSANDRA-2474 is done, because `system.keyspaces` uses composite columns.

          Jonathan Ellis added a comment -

          Instead of converting deserialized thrift objects to KSMetaData we would need to initialize KSMetaData/CFMetaData and populate those with parameters from db, is it really better?

          At the least, it lets you re-use apply/diff instead of rewriting that, and easy CASSANDRA-2477.

          With current patch user will be able to do "SELECT * FROM system.keyspaces" and other queries but after CASSANDRA-2474 is done because `system.keyspaces` uses composite columns

          Does it? It sure looks like it uses serialized Thrift objects to me. (Looking at DefsTable.loadFromStorage.)

          Pavel Yaskevich added a comment -

          There is a SCHEMA_CF (where we store the serialized schema state after each migration) and a KEYSPACES_CF, which is involved in the ks/cf attribute-diff process; you can create a keyspace, open a CLI, and do "use system; list 'keyspaces';" to see that I'm not making that up

          Jonathan Ellis added a comment -

          From experience, that sounds like a great way to get bugs that update one but not the other. We really need a single "source of truth."

          Pavel Yaskevich added a comment -

          I can remove SCHEMA_CF, rename KEYSPACES_CF to SCHEMA_CF, and use it to easily build the schema from the db on startup.

          Sylvain Lebresne added a comment -

          Besides, even if there were no other argument, I really think that 'get rid of thrift (or avro) internally' is a win in the long run for doing it using apply/diff. But, like Jonathan, I see no reason not to use apply/diff if we can, instead of rewriting equivalent methods.

          Moreover, it seems to me that with the apply/diff approach a schema change would basically be 'send the new schema as a batch mutation to all nodes', and 'do node1 and node2 agree on the schema' is just 'read node1's and node2's schema rows, diff the result, and send a batch mutation with whatever each node needs'. There is no need for 'schema versions' or anything; column timestamps just deal with that problem. You only ever keep one row for the schema, and that's it. So it seems to me that this basically makes concurrency a non-issue (because we already handle concurrency internally through the use of column timestamps), while I don't see how those concurrency issues come for free if you use some thrift serialization.

          As Jonathan said, there may be a fundamental problem with the apply/diff approach, but I don't see any right away (and truth is, I'm much more confident in C*'s existing data and concurrency model to handle conflicts during schema updates than in the current (or any) ad-hoc migration thingy).

          Pavel Yaskevich added a comment - - edited

          Let me get this clear: migrations use apply/diff internally for their actions upon KEYSPACES_CF. The 0003 patch introduces a content-based schema version which is calculated from KEYSPACES_CF.

          KEYSPACES_CF layout

          name: { // key
            'keyspace': str,
            'comparator': str,
            ... 
            'columns': { // composite!
              column name: {
                'validation_class': str,
                'index_type': str,
                'index_name': str,
                'index_options': { }
              }
            }
          }
          

          Current schema distribution is switched to be pull-oriented: node A, let's call it the coordinator, applies a migration locally and gossips its new (content-based) version to the ring. Node B checks whether its current version differs from node A's new version and, if so, makes a migration request to the coordinator by sending a MIGRATION_REQUEST message with a list of its local migrations attached. Upon receiving that message, the coordinator diffs B's migrations against its own and replies to B with the missing ones. The last thing for B to do is deserialize the received migrations and apply them one by one. Upon startup, a node uses the onAlive gossip handler to check versions on other nodes and request missing migrations if needed.

          This feels better than sending the whole KEYSPACES_CF on each schema change and letting the receiver decide what actions to take on it. Migrations are a good fit for that part because they know exactly what to do for a specific change, e.g. when an Add Keyspace is issued: load the new CF definitions into the schema, open the table, create the corresponding directories, etc.
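          As a rough illustration of the KEYSPACES_CF layout above, the nested attributes can be flattened into composite column names, one schema row per keyspace. The "columns:&lt;name&gt;:&lt;attr&gt;" string encoding below is an invented stand-in for the real composite-column comparator, shown only to make the row shape concrete.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class KeyspacesCfSketch {
    // build the flat column map for one keyspace row; the separator-based
    // names stand in for real composite columns
    public static Map<String, String> keyspaceRow(String comparator,
                                                  String columnName,
                                                  String validationClass) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("comparator", comparator);                                        // top-level attribute
        row.put("columns:" + columnName + ":validation_class", validationClass);  // per-column attribute
        return row;
    }

    public static void main(String[] args) {
        System.out.println(keyspaceRow("UTF8Type", "age", "LongType"));
    }
}
```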

          Sylvain Lebresne added a comment -

          Let me get this clear - migrations use apply/diff internally for their actions upon KEYSPACE_CF

          But then why do the patches still use thrift internally? (I have only quickly eyeballed the patches, but they do seem to use thrift, which seems confirmed by Jonathan's comments.)

          sending MIGRATION_REQUEST message with list of its local migrations attached

          Does that mean we still keep the list of all migrations (diffs) that have ever been applied? If so, I would be in favor of getting rid of it, as it seems to me we can do without it (a node could use the diff between its schema and another node's schema, and base whatever actions have to be done (directory creation, etc...) on that, but we wouldn't keep the diff afterwards).

          Current schema distribution is switched to be pull oriented

          This is probably not a huge deal but it means that schema changes will be a tad slower, based on gossip reactivity.

          Pavel Yaskevich added a comment -

          But then why do the patches still use thrift internally (I have only quickly eyeballed the patches but it does seem to use thrift, which seems confirmed by Jonathan comments).

          Thrift is only used to keep the current/new state of the ks/cf inside the migration, to be used for the diff upon apply; it doesn't really matter how we keep that - thrift, json, even comma-separated plain text. I was guided by the fact that we already use thrift internally, and it makes it easy to get the attributes of the object when needed.

          Does that mean we still keep the list of all migrations (diffs) that have ever been applied? If so, I would be in favor of getting rid of it, as it seems to me we can do without (node could use the diffs between their schema and another node schema and base whatever action have to be done (directories creation, etc...) on that, be we wouldn't keep the diff afterwards).

          I still don't get why it's such a huge deal to re-implement the migration logic that way - what does it really give us compared to migrations? Migrations already give us a straightforward way to tell a node what to do without involving any schema transfers or decision making.

          Jonathan Ellis added a comment -

          I think pull is the right fit for a content-based schema version:

          Currently, schema changes are push-only (see: SS.pushMigrations). So without changing that, yes, both nodes will send schema to each other (which will be a no-op on the newer one). That's not a blocker for me. I'd be fine switching to a pull model in either this or a followup ticket, which would let the newer side skip its pull if it recognizes the remote version as one it used to have (which would reliably indicate it's "older" even if timestamp ties are involved).

          Jonathan Ellis added a comment -

          what does it really gives us comparing to migrations?

          it should be equivalent to "one big migration" when we use apply/diff, which is a lot simpler than recording each change since the beginning of time and replaying them, and a lot more convenient when adding new nodes to do it all at once.

          Pavel Yaskevich added a comment -

          It doesn't feel simpler to me to do it in one big run, where you would need to go through the whole schema (possibly a few times), compared to applying a single well-defined change. We don't expect migrations to be very frequent, e.g. because special conditions must be ensured for the ks/cf being updated, so even with 1000 migrations (I don't know if that's even realistic) it's not a problem for a new node to get them in one request and apply them sequentially.

          Jonathan Ellis added a comment -

          But if it's modeled as "the migration is just a RowMutation whose CF is in the same format we store the schema internally" there is no extra code involved to do both. You can have a small "migration" rowmutation for when the user adds a column, and just send the entire schema at once for new nodes joining.
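          The "no extra code to do both" point can be sketched like this: a small "add column" migration and a full-schema transfer are the same shape of mutation, and redundant writes resolve to no-ops. Class names here are invented for illustration, not Cassandra's actual RowMutation API.

```java
import java.util.Map;
import java.util.TreeMap;

public class OneBigMigrationSketch {
    // a schema cell: value plus timestamp, with value-based equality
    public static final class Cell {
        public final String value;
        public final long timestamp;
        public Cell(String value, long timestamp) { this.value = value; this.timestamp = timestamp; }
        @Override public boolean equals(Object o) {
            return o instanceof Cell && ((Cell) o).value.equals(value) && ((Cell) o).timestamp == timestamp;
        }
        @Override public int hashCode() { return value.hashCode() * 31 + (int) timestamp; }
    }

    // one apply path: works identically for a one-column migration and a
    // full-schema transfer, since both are just column -> cell maps
    public static void apply(Map<String, Cell> schema, Map<String, Cell> mutation) {
        for (Map.Entry<String, Cell> e : mutation.entrySet()) {
            Cell existing = schema.get(e.getKey());
            if (existing == null || existing.timestamp < e.getValue().timestamp)
                schema.put(e.getKey(), e.getValue());
        }
    }

    public static void main(String[] args) {
        Map<String, Cell> fullSchema = new TreeMap<>();
        fullSchema.put("ks1.cf1.comment", new Cell("UTF8Type", 1L));
        fullSchema.put("ks1.cf1.new_col", new Cell("LongType", 2L));

        Map<String, Cell> joiningNode = new TreeMap<>();   // empty: needs everything
        Map<String, Cell> existingNode = new TreeMap<>();  // already has part of it
        existingNode.put("ks1.cf1.comment", new Cell("UTF8Type", 1L));

        apply(joiningNode, fullSchema);   // bootstrap: applies the whole schema
        apply(existingNode, fullSchema);  // up-to-date columns are no-ops

        System.out.println(joiningNode.equals(existingNode)); // prints true
    }
}
```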

          Jonathan Ellis made changes -
          Status Patch Available [ 10002 ] In Progress [ 3 ]
          Jonathan Ellis made changes -
          Priority Major [ 3 ] Critical [ 2 ]
          Jonathan Ellis made changes -
          Priority Critical [ 2 ] Major [ 3 ]
          Jonathan Ellis made changes -
          Priority Major [ 3 ] Critical [ 2 ]
          Jonathan Ellis made changes -
          Link This issue relates to CASSANDRA-2477 [ CASSANDRA-2477 ]
          Pavel Yaskevich made changes -
          Attachment 0001-new-migration-schema-and-avro-methods-cleanup.patch [ 12509553 ]
          Pavel Yaskevich made changes -
          Attachment 0002-avro-removal.patch [ 12509554 ]
          Pavel Yaskevich made changes -
          Attachment 0003-oldVersion-removed-new-migration-distribution-schema.patch [ 12509562 ]
          Pavel Yaskevich made changes -
          Attachment CASSANDRA-1391.patch [ 12492603 ]
          Pavel Yaskevich added a comment -

          Migrations are made a one-time local thing; serialization/deserialization and saving to the system CFs are removed. The schema is propagated in the form of row mutations to be applied to the system 'keyspaces' CF (the apply/diff strategy). Upon a local, user-initiated migration, the node uses a push tactic (it sends the updated schema state to all alive nodes in the cluster); the pull tactic (the node requesting schema from a remote node) kicks in only when a node detects schema disagreement by means of gossip, e.g. when a new node joins the ring or a push was lost due to network or other circumstances. Supports automatic schema migration from the old storage (after migration, the deprecated system 'schema' and 'migrations' column families are dropped). toAvro() methods are removed as no longer needed; fromAvro(...) methods are kept as @Deprecated to support schema migration from the old schema CF to the new CF. The 'keyspaces' CF contains data in human-readable JSON format.

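          The push/pull tactics described above can be sketched roughly as follows. This is a hypothetical illustration only: the Cluster interface, class and method names are invented for the sketch and do not match the real MigrationManager/Gossiper API.

          ```java
          import java.util.Set;
          import java.util.UUID;

          public class SchemaPropagationSketch
          {
              // Invented abstraction over the cluster; names are illustrative.
              public interface Cluster
              {
                  Set<String> liveEndpoints();
                  void sendSchemaMutation(String endpoint, byte[] mutation); // push
                  byte[] requestSchema(String endpoint);                     // pull
              }

              private final Cluster cluster;
              private UUID localSchemaVersion;

              public SchemaPropagationSketch(Cluster cluster, UUID version)
              {
                  this.cluster = cluster;
                  this.localSchemaVersion = version;
              }

              // Push tactic: a locally initiated migration is applied, then the
              // updated schema state is sent to every live node in the cluster.
              public void announceLocalMigration(byte[] mutation, UUID newVersion)
              {
                  localSchemaVersion = newVersion;
                  for (String endpoint : cluster.liveEndpoints())
                      cluster.sendSchemaMutation(endpoint, mutation);
              }

              // Pull tactic: triggered only when gossip reveals a schema
              // disagreement, e.g. a new node joined the ring or a push was lost.
              public void onGossipedSchemaVersion(String endpoint, UUID remoteVersion)
              {
                  if (!remoteVersion.equals(localSchemaVersion))
                      cluster.requestSchema(endpoint);
              }
          }
          ```

          The key property the sketch shows is that pulls are driven purely by gossiped version disagreement, so a lost push is eventually repaired without any persistent migration log.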
          Pavel Yaskevich made changes -
          Attachment CASSANDRA-1391.patch [ 12510853 ]
          Pavel Yaskevich made changes -
          Status In Progress [ 3 ] Patch Available [ 10002 ]
          Brandon Williams made changes -
          Link This issue supercedes CASSANDRA-3750 [ CASSANDRA-3750 ]
          Jonathan Ellis added a comment -

          rebased patch attached. comments:

          • validateSchemaAgreement is unnecessary now right?
          • the old Migration infrastructure feels unnecessarily heavyweight now. Can we move the validation into the CassandraServer methods, and then just invoke a MigrationHelper method from a runnable there?
          • should we snapshot the old avro schema before nuking it?
          • SystemTable.dropOldSchemaTables is a no-op. I think we can take this out entirely since loadSchema/fromAvro takes care of it?
          • Can you add a comment describing the layout of the new schema CFs to defstable or systemtable?
          • I'd prefer to leave the low level slicing / deserialize in SystemTable class instead of scattered between Schema and DefsTable
          Jonathan Ellis made changes -
          Attachment 1391-rebased.txt [ 12511195 ]
          Pavel Yaskevich added a comment -

          validateSchemaAgreement is unnecessary now right?

          I think it's still a good idea to validate if all nodes have the same schema.

          the old Migration infrastructure feels unnecessarily heavyweight now. Can we move the validation into the CassandraServer methods, and then just invoke a MigrationHelper method from a runnable there?

          I tried to optimize it as much as possible, but I still think there is a reason to keep it: it encapsulates all of the announce, apply and validation logic pretty well. I tried to move the validation and related code into the CassandraServer methods, but it turned out hardly readable and heavyweight.

          should we snapshot the old avro schema before nuking it?

          MigrationHelper.dropColumnFamily, which I call to remove the Migrations and Schema CFs, makes a snapshot of the data.

          SystemTable.dropOldSchemaTables is a no-op. I think we can take this out entirely since loadSchema/fromAvro takes care of it?

          Ugh, I forgot to remove it from the final version of the patch, sorry...

          Can you add a comment describing the layout of the new schema CFs to defstable or systemtable?

          Sure, I will do that in SystemTable.

          I'd prefer to leave the low level slicing / deserialize in SystemTable class instead of scattered between Schema and DefsTable

          Sure, I will move the serialize and deserialize methods from Schema to SystemTable; DefsTable.readSchemaRow and getSchema also go there.

          Pavel Yaskevich added a comment -

          Rebased the main patch and applied the fixes from Jonathan's previous comment: dropped the unused dropOldSchemaTables method from SystemTable, added KEYSPACES_CF layout information to the DefsTable class header (with a pointer to it in SystemTable), and combined all low-level methods from Schema and DefsTable into SystemTable.

          Pavel Yaskevich made changes -
          Attachment 0001-CASSANDRA-1391-main.patch [ 12511253 ]
          Attachment 0002-CASSANDRA-1391-fixes.patch [ 12511254 ]
          Jonathan Ellis added a comment -

          Thanks, that helps a lot.

          I think we can make life easier for clients dealing with CASSANDRA-2477 if we split out the columns into a separate CF, and adjust how we use composites for the columnfamilies cf:

          schema_keyspaces
          ----------------
          RowKey: ks
            => (column=durable_writes, value=true, timestamp=1327061028312185000)
            => (column=name, value="ks", timestamp=1327061028312185000)
            => (column=replication_factor, value=0, timestamp=1327061028312185000)
            => (column=strategy_class, value="org.apache.cassandra.locator.NetworkTopologyStrategy", timestamp=1327061028312185000)
            => (column=strategy_options, value={"datacenter1":"1"}, timestamp=1327061028312185000)
          
          schema_columnfamilies
          ---------------------
          RowKey: ks
            => (column=cf:bloom_filter_fp_chance, value=0.0, timestamp=1327061105833119000)
            => (column=cf:caching, value="NONE", timestamp=1327061105833119000)
            => (column=cf:column_type, value="Standard", timestamp=1327061105833119000)
            => (column=cf:comment, value="ColumnFamily", timestamp=1327061105833119000)
            => (column=cf:default_validation_class, value="org.apache.cassandra.db.marshal.BytesType", timestamp=1327061105833119000)
            => (column=cf:gc_grace_seconds, value=864000, timestamp=1327061105833119000)
            => (column=cf:id, value=1000, timestamp=1327061105833119000)
            => (column=cf:key_alias, value="S0VZ", timestamp=1327061105833119000)
          
          
          schema_columns
          --------------
          RowKey: ks
            => (column=cf:c:index_name, value=null, timestamp=1327061105833119000)
            => (column=cf:c:index_options, value=null, timestamp=1327061105833119000)
            => (column=cf:c:index_type, value=null, timestamp=1327061105833119000)
            => (column=cf:c:name, value="aGVsbG8=", timestamp=1327061105833119000)
            => (column=cf:c:validation_class, value="org.apache.cassandra.db.marshal.AsciiType", timestamp=1327061105833119000)
          
          

          This will be more forwards-compatible with CASSANDRA-2474/CQL 3.0, since these correspond to tables having PRIMARY KEY (keyspace, columnfamily) and PRIMARY KEY (keyspace, columnfamily, column), respectively.

          This also has the side benefit of grouping everything for a single keyspace under the same row key, which means it will be a single atomic RowMutation.

          I think leaving strategy_options as json is fine.

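          The composite naming in the proposed layout can be illustrated with a small sketch. This is illustrative only, not Cassandra code: each schema_columnfamilies column is named "&lt;cf&gt;:&lt;attribute&gt;" under a keyspace row key (the CQL 3.0 equivalent of PRIMARY KEY (keyspace, columnfamily)), and a flat row groups back into a per-CF view like so:

          ```java
          import java.util.Map;
          import java.util.TreeMap;

          public class SchemaLayoutSketch
          {
              // Illustrative only: regroups a flat schema_columnfamilies row,
              // whose column names are "<cf>:<attribute>" composites, into a
              // cf -> (attribute -> value) view.
              public static Map<String, Map<String, String>> groupByColumnFamily(Map<String, String> row)
              {
                  Map<String, Map<String, String>> byCf = new TreeMap<>();
                  for (Map.Entry<String, String> col : row.entrySet())
                  {
                      // composite column name: cf name first, then the attribute
                      String[] parts = col.getKey().split(":", 2);
                      byCf.computeIfAbsent(parts[0], k -> new TreeMap<>()).put(parts[1], col.getValue());
                  }
                  return byCf;
              }
          }
          ```

          The same idea extends one level deeper for schema_columns, where the composite is "&lt;cf&gt;:&lt;column&gt;:&lt;attribute&gt;", matching PRIMARY KEY (keyspace, columnfamily, column).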
          Brandon Williams added a comment -

          I get exceptions while inducing concurrent schema changes:

          ERROR 20:58:10,904 Fatal exception in thread Thread[MigrationStage:1,5,main]
          java.lang.NullPointerException
                  at org.apache.cassandra.config.ColumnDefinition.toMap(ColumnDefinition.java:150)
                  at org.apache.cassandra.config.CFMetaData.diff(CFMetaData.java:978)
                  at org.apache.cassandra.db.migration.MigrationHelper.updateColumnFamily(MigrationHelper.java:285)
                  at org.apache.cassandra.db.migration.MigrationHelper.updateColumnFamily(MigrationHelper.java:200)
                  at org.apache.cassandra.db.DefsTable.mergeRemoteSchema(DefsTable.java:360)
                  at org.apache.cassandra.db.DefinitionsUpdateVerbHandler$1.runMayThrow(DefinitionsUpdateVerbHandler.java:48)
                  at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
                  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
                  at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
                  at java.util.concurrent.FutureTask.run(FutureTask.java:138)
                  at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
                  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
                  at java.lang.Thread.run(Thread.java:662)
          
          Pavel Yaskevich added a comment -

          Jonathan: Sure, I will do that, although wouldn't it be better to name the ColumnFamilies using camel case, like SchemaKeyspaces, SchemaColumnFamilies and SchemaColumns?

          Brandon: can you please describe the situation when that happened? Have you deleted all of the columns in the update? It seems like I just forgot to add an if (columnDefs == null) return <empty map>; case in the ColumnDefinition.toMap(List<ColumnDef>) method.

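          A minimal sketch of the missing guard described above. The class and the map's value type are hypothetical stand-ins for the real ColumnDefinition.toMap(List&lt;ColumnDef&gt;); the point is only that a null column list (e.g. a CF updated to have no columns) must yield an empty map instead of the NPE Brandon hit in CFMetaData.diff:

          ```java
          import java.util.Collections;
          import java.util.HashMap;
          import java.util.List;
          import java.util.Map;

          public class ColumnDefinitionSketch
          {
              // Hypothetical stand-in for ColumnDefinition.toMap(List<ColumnDef>).
              // Each def is modeled as a {name, validationClass} pair for brevity.
              public static Map<String, String> toMap(List<String[]> columnDefs)
              {
                  // The fix: a null column list means "no columns", not an error.
                  if (columnDefs == null)
                      return Collections.emptyMap();

                  Map<String, String> defs = new HashMap<>();
                  for (String[] def : columnDefs)
                      defs.put(def[0], def[1]);
                  return defs;
              }
          }
          ```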
          Brandon Williams added a comment -

          can you please describe the situation when that happened

          It was pretty simple; I was just getting warmed up. I issued creation of a CF on three machines at once; one got a schema disagreement and the other two received this exception.

          Pavel Yaskevich added a comment -

          Ok, gotcha. I will add the null check to ColumnDefinition that I mentioned previously, and re-test everything once I'm done with the changes requested by Jonathan.

          Sylvain Lebresne added a comment -

          although would it be better to name ColumnFamilies using camel-case

          Making them fully lowercase internally would make them case-insensitive without any extra work using the current patch for CASSANDRA-3761. It's not really a big deal in any case, because 1) the patch for CASSANDRA-3761 is not yet committed, so it could change, and 2) it won't be very hard to add some special casing for those if we wish. But if nobody has a preference, I would suggest calling them 'keyspaces' and 'columnfamilies' directly.

          Pavel Yaskevich added a comment -

          I'm fine dropping "schema_" prefix and going with "keyspaces", "columnfamilies" but how do we name "columns" cf then, something like "columnfamily_columns"?

          Jonathan Ellis added a comment -

          It's not a big deal, but IMO underscore + lowercase fits better with CQL 3.0 making everything case-insensitive by default.

          Pavel Yaskevich added a comment -

          Works for me, so be it "schema_keyspaces", "schema_columnfamilies" and "schema_columns".

          Jonathan Ellis made changes -
          Status Patch Available [ 10002 ] In Progress [ 3 ]
          Jonathan Ellis added a comment - - edited

          Is this going to be done today or tomorrow for 1.1 freeze?

          Pavel Yaskevich added a comment -

          Absolutely! I'm finishing up a few last things and am going to attach a patch in a few hours.

          Pavel Yaskevich added a comment -

          Squashed the previous two patches into one and split the schema into separate schema_keyspaces, schema_columnfamilies and schema_columns CFs.

          Pavel Yaskevich made changes -
          Attachment CASSANDRA-1391-v2.patch [ 12511707 ]
          Brandon Williams added a comment -

          No exceptions while testing v2.

          Pavel Yaskevich made changes -
          Status In Progress [ 3 ] Patch Available [ 10002 ]
          Jonathan Ellis added a comment -

          LGTM, +1.

          Pavel Yaskevich added a comment -

          Committed.

          Pavel Yaskevich made changes -
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Resolution Fixed [ 1 ]
          Gavin made changes -
          Workflow no-reopen-closed, patch-avail [ 12517994 ] patch-available, re-open possible [ 12748968 ]
          Gavin made changes -
          Workflow patch-available, re-open possible [ 12748968 ] reopen-resolved, no closed status, patch-avail, testing [ 12756755 ]

            People

            • Assignee:
              Pavel Yaskevich
              Reporter:
              Stu Hood
              Reviewer:
              Jonathan Ellis
            • Votes:
              3 Vote for this issue
              Watchers:
              12 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development