  1. Cassandra
  2. CASSANDRA-15844

Creating a table asynchronously, or creating the same table through the same node from many client threads at the same time, may cause data loss


Details

    • Bug
    • Status: Open
    • Normal
    • Resolution: Unresolved
    • None
    • Cluster/Schema
    • None
    • Correctness - Recoverable Corruption / Loss
    • Normal
    • Normal
    • User Report
    • All
    • None

    Description

      When a table is created through one coordinator node from several client threads at the same time, or with the session.executeAsync() method, the schema information may become incorrect. In serious cases this causes data loss.
      In my test, I used executeAsync() to create the table several times in a row with the same table name (I do know that CREATE TABLE should be executed synchronously, but some of our customers may create tables with executeAsync()). My expectation was that the last CQL statement

      CREATE TABLE ks.tb (name text PRIMARY KEY , age int, adds text, height text)
      

      should take effect.

      But after running the code, I found that the result was not what I expected. The resulting schema is:

      CREATE TABLE ks.tb (name text PRIMARY KEY , age int, adds text, sex int, height int)
      


      In addition, the schema version in memory and the one on disk are not the same.

      When a new column family is added (a new table is created), requests to create the same table with different schema definitions may arrive at the same time, either from different clients or through the executeAsync() method.

      private static void announceNewColumnFamily(CFMetaData cfm, boolean announceLocally, boolean throwOnDuplicate, long timestamp) throws ConfigurationException
      {
          cfm.validate();

          KeyspaceMetadata ksm = Schema.instance.getKSMetaData(cfm.ksName);
          if (ksm == null)
              throw new ConfigurationException(String.format("Cannot add table '%s' to non existing keyspace '%s'.", cfm.cfName, cfm.ksName));
          // If we have a table or a view which has the same name, we can't add a new one
          else if (throwOnDuplicate && ksm.getTableOrViewNullable(cfm.cfName) != null)
              throw new AlreadyExistsException(cfm.ksName, cfm.cfName);

          logger.info("Create new table: {}", cfm);
          announce(SchemaKeyspace.makeCreateTableMutation(ksm, cfm, timestamp), announceLocally);
      }
      

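      The table-existence check may fail to catch the duplicate when the requests arrive concurrently, so every request for the same table may go on to call announce(), and the resulting mutations are then applied in mergeSchema():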

      public static synchronized void mergeSchema(Collection<Mutation> mutations, boolean forDynamoTTL)
      {
          // only compare the keyspaces affected by this set of schema mutations
          Set<String> affectedKeyspaces =
              mutations.stream()
                       .map(m -> UTF8Type.instance.compose(m.key().getKey()))
                       .collect(Collectors.toSet());

          // fetch the current state of schema for the affected keyspaces only
          Keyspaces before = Schema.instance.getKeyspaces(affectedKeyspaces);

          // apply the schema mutations and flush
          mutations.forEach(Mutation::apply);
          if (FLUSH_SCHEMA_TABLES)
              flush();

          // fetch the new state of schema from schema tables (not applied to Schema.instance yet)
          Keyspaces after = fetchKeyspacesOnly(affectedKeyspaces);

          mergeSchema(before, after);
          scheduleDynamoTTLClean(forDynamoTTL, mutations);
      }
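
      The check-then-act pattern in announceNewColumnFamily() can be illustrated outside Cassandra with a minimal sketch. Everything in it is hypothetical and is not Cassandra code: the class CheckThenActRace, the tables map standing in for the in-memory schema, and the CyclicBarrier, which is there only to force the problematic interleaving deterministically (both requests finish the existence check before either one writes).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

class CheckThenActRace
{
    // Hypothetical stand-in for the in-memory schema: table name -> definition.
    private final Map<String, String> tables = new ConcurrentHashMap<>();
    // Number of requests that got past the existence check.
    private final AtomicInteger announced = new AtomicInteger();
    // Test-only device: makes both requests finish the check before either one writes.
    private final CyclicBarrier afterCheck = new CyclicBarrier(2);

    void createTable(String name, String definition)
    {
        // Check: the table is not registered yet, so no exception is thrown.
        if (tables.containsKey(name))
            throw new IllegalStateException(name + " already exists");
        try
        {
            afterCheck.await();
        }
        catch (Exception e)
        {
            throw new RuntimeException(e);
        }
        // Act: both requests reach this point, so both definitions get "announced".
        announced.incrementAndGet();
        tables.put(name, definition);
    }

    static int run()
    {
        CheckThenActRace schema = new CheckThenActRace();
        Thread first = new Thread(() -> schema.createTable("ks.tb", "definition A"));
        Thread second = new Thread(() -> schema.createTable("ks.tb", "definition B"));
        first.start();
        second.start();
        try
        {
            first.join();
            second.join();
        }
        catch (InterruptedException e)
        {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
        return schema.announced.get();
    }

    public static void main(String[] args)
    {
        System.out.println("requests that passed the existence check: " + run());
    }
}
```

      In this sketch neither request sees the other's table, so both pass the check. Without the barrier the outcome depends on thread timing, which is why the problem only shows up under concurrent or asynchronous CREATE TABLE requests.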
      

      Because each of these requests may write its table definition to disk, we ended up with

      CREATE TABLE ks.tb (name text PRIMARY KEY , age int, adds text, sex int, height int)
      

      in our case.
      We also saw different schema versions in memory and on disk.
      Writes use the schema held in memory, but after a node restart the schema definition on disk is used, which may then cause data loss.
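
      One possible way to end up with a merged definition like the one reported above is sketched below. It assumes that the on-disk schema keeps one entry per column and reconciles each column independently, with the newest write timestamp winning. The two input definitions, their timestamps, and the class ColumnMergeSketch are invented for illustration; the report does not list the earlier CREATE TABLE statements or their timestamps.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ColumnMergeSketch
{
    // One stored column definition: its type and the timestamp of the mutation that wrote it.
    static final class Cell
    {
        final String type;
        final long timestamp;

        Cell(String type, long timestamp)
        {
            this.type = type;
            this.timestamp = timestamp;
        }
    }

    // Hypothetical stand-in for the per-column schema rows on disk.
    private final Map<String, Cell> columnsOnDisk = new LinkedHashMap<>();

    // Applies one CREATE TABLE mutation; each column is reconciled on its own, newest timestamp wins.
    void apply(Map<String, String> columns, long timestamp)
    {
        for (Map.Entry<String, String> column : columns.entrySet())
        {
            Cell incoming = new Cell(column.getValue(), timestamp);
            columnsOnDisk.merge(column.getKey(), incoming,
                                (current, next) -> next.timestamp > current.timestamp ? next : current);
        }
    }

    Map<String, String> result()
    {
        Map<String, String> types = new LinkedHashMap<>();
        columnsOnDisk.forEach((name, cell) -> types.put(name, cell.type));
        return types;
    }

    static Map<String, String> demo()
    {
        // Assumed earlier definition, carrying the newer timestamp (2).
        Map<String, String> a = new LinkedHashMap<>();
        a.put("name", "text");
        a.put("age", "int");
        a.put("adds", "text");
        a.put("sex", "int");
        a.put("height", "int");

        // The definition expected to win, assumed here to carry the older timestamp (1).
        Map<String, String> b = new LinkedHashMap<>();
        b.put("name", "text");
        b.put("age", "int");
        b.put("adds", "text");
        b.put("height", "text");

        ColumnMergeSketch disk = new ColumnMergeSketch();
        disk.apply(a, 2L);
        disk.apply(b, 1L);
        return disk.result();
    }

    public static void main(String[] args)
    {
        System.out.println(demo());
    }
}
```

      Under these assumptions the merged result keeps the sex column and the int type for height, because columns are never removed by a later CREATE TABLE and the older timestamp loses on height. This only illustrates a per-column merge; it does not by itself explain why the in-memory schema version differs from the one on disk.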

      Attachments

        1. schemaversion.jpg
          126 kB
          Maxwell Guo
        2. keyspace inner.jpg
          123 kB
          Maxwell Guo
        3. createkeyspace.jpg
          101 kB
          Maxwell Guo

        Activity

          People

            maxwellguo Maxwell Guo
            maxwellguo Maxwell Guo
            Maxwell Guo