[IGNITE-21003] Switch to the 'always use current schema' approach - ASF JIRA

XML

Word

Printable

JSON

Details

Type: Improvement
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: None
Component/s: None
Labels:
- ignite-3

Epic Link:
Schema synchronization, Part 3

Description

This issue is to think about the problem and design it first, not about implementing it right away.

Currently, for an RW transaction, we take schema at the beginning of the transaction and run the whole transaction on that schema (indices are an exception, but this is not visible to the end user). The transaction still notices most schema changes (if a change happens before a read/write in the transaction, the transaction gets aborted), but it does not notice a table created after the transaction had started.

IGNITE-20107 addressed this issue, but it was decided that we need to design this in more depth.

An alternative is to always use the latest schema on each operation (still having schema validation). This might have some downsides/bring difficulties:

Same query might return data with different schemas
It's not clear against which schema to validate the current schema at execution of each operation (probably, still against the initial one, defined as Max(txStartTs, tableCreationTs))
A query would have to use the same query timestamp on each node executing its fragments, so it would have to propagate this timestamp

The schema synchronization design was created starting with an assumption that the schema is taken for the start of a transaction, so the design should be revised carefully when switching to the proposed one.

Proposed changes

It seems that this can be achieved with the following:

Always execute a KV/SQL operation/query using the current schema (obtained using the schema sync procedure for now())
An operation/query executed distributively (like SQL queries that produce a few fragments executed on different nodes) must pass that query timestamp to each of the nodes participating in its execution; they must use this timestamp to get the 'current' schema
In each read/write operation processed in a PartitionReplicaListener, instead of failing the operation if the current schema is different from the initial transaction schema, do the (already implemented) forward compatibility check (so a few white-listed change types, like adding a nullable column, will be allowed). This is optional, but the 'fail any read/write if the table schema is changed in any way' rule was introduced to disallow a user see any effects of a DDL in the middle of the transaction (except for its abortion); if we allow each query see new schema, this strict rule kinda makes no sense anymore. (On commit, we still do forward compatibility check)
To do the commit/read/write forward compatibility check, take the initial table schema not at transaciton start, but at the moment when the transaction had first enlisted the table. For this, we might need a mechanism to pass the 'tableId->enlistTs' map with each transactional operation/query (back and forth), analogously to how it's done to maintain maxObservableTimestamp.
Probably we should make column rename forward incompatible as the user will now have to switch to the new name immediately.

Attachments

Issue Links

relates to

IGNITE-20107 Make table created after tx started visible to the tx

Open

IGNITE-20108 Use tableEnlistTs as baseTs when validating schema

Open

Activity

People

Assignee:: Unassigned

Reporter:: Roman Puchkovskiy

Votes:: 0 Vote for this issue

Watchers:: 1 Start watching this issue

Dates

Created:: 01/Dec/23 14:33

Updated:: 04/Dec/23 07:09