Uploaded image for project: 'Ignite'
  1. Ignite
  2. IGNITE-21003

Switch to the 'always use current schema' approach

    XMLWordPrintableJSON

Details

    • Improvement
    • Status: Open
    • Major
    • Resolution: Unresolved
    • None
    • None
    • None

    Description

      This issue is to think about the problem and design it first, not about implementing it right away.

      Currently, for an RW transaction, we take schema at the beginning of the transaction and run the whole transaction on that schema (indices are an exception, but this is not visible to the end user). The transaction still notices most schema changes (if a change happens before a read/write in the transaction, the transaction gets aborted), but it does not notice a table created after the transaction had started.

      IGNITE-20107 addressed this issue, but it was decided that we need to design this in more depth.

      An alternative is to always use the latest schema on each operation (still having schema validation). This might have some downsides/bring difficulties:

      1. Same query might return data with different schemas
      2. It's not clear against which schema to validate the current schema at execution of each operation (probably, still against the initial one, defined as Max(txStartTs, tableCreationTs))
      3. A query would have to use the same query timestamp on each node executing its fragments, so it would have to propagate this timestamp

      The schema synchronization design was created starting with an assumption that the schema is taken for the start of a transaction, so the design should be revised carefully when switching to the proposed one.

      Proposed changes

      It seems that this can be achieved with the following:

      1. Always execute a KV/SQL operation/query using the current schema (obtained using the schema sync procedure for now())
      2. An operation/query executed distributively (like SQL queries that produce a few fragments executed on different nodes) must pass that query timestamp to each of the nodes participating in its execution; they must use this timestamp to get the 'current' schema
      3. In each read/write operation processed in a PartitionReplicaListener, instead of failing the operation if the current schema is different from the initial transaction schema, do the (already implemented) forward compatibility check (so a few white-listed change types, like adding a nullable column, will be allowed). This is optional, but the 'fail any read/write if theĀ  table schema is changed in any way' rule was introduced to disallow a user see any effects of a DDL in the middle of the transaction (except for its abortion); if we allow each query see new schema, this strict rule kinda makes no sense anymore. (On commit, we still do forward compatibility check)
      4. To do the commit/read/write forward compatibility check, take the initial table schema not at transaciton start, but at the moment when the transaction had first enlisted the table. For this, we might need a mechanism to pass the 'tableId->enlistTs' map with each transactional operation/query (back and forth), analogously to how it's done to maintain maxObservableTimestamp.
      5. Probably we should make column rename forward incompatible as the user will now have to switch to the new name immediately.

      Attachments

        Issue Links

          Activity

            People

              Unassigned Unassigned
              rpuch Roman Puchkovskiy
              Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

                Created:
                Updated: