Details
Epic
Status: Reopened
Major
Resolution: Unresolved
None
Thin 3.0: Schema validation
Description
Motivation
Current Ignite 3 behavior is inconsistent when user data has unmapped columns:
- POJO: unmapped columns (not in schema) are ignored;
- Tuple: unmapped columns are ignored on the client, but cause an exception on the server (when the server-side API is used from a Compute task).
We should ensure consistent and reliable behavior across all APIs and clients.
Non-goals
- Validate column types (already handled by serializers)
- Deal with any other schema aspects (indexes, constraints) which are not present on the client side
Requirements
All APIs (Record, KeyValue, RecordBinary, KeyValueBinary) must reject incompatible rows. A row is incompatible when:
- Unmapped columns are present;
- Columns without a default value are missing.
Where validation is performed:
- By the server when possible;
- By the client for unmapped columns, because rows are serialized according to the schema (the server does not see unmapped columns).
Design
Case 1: Missing Columns
Already handled by the client and the server:
- Client sends noValueSet to indicate which columns were not provided by the user;
- Server rejects rows when the column is not set by the user and does not have a default value.
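The server-side check described above can be sketched roughly as follows. This is a minimal illustration, not the actual Ignite code: it assumes that noValueSet is a bit set in which bit i means "column i was not provided by the user", and ColumnSketch and MissingColumnCheck are hypothetical names introduced only for this example.

```java
import java.util.BitSet;
import java.util.List;

/** Hypothetical column descriptor (not the Ignite schema API). */
final class ColumnSketch {
    final String name;
    final boolean hasDefault;

    ColumnSketch(String name, boolean hasDefault) {
        this.name = name;
        this.hasDefault = hasDefault;
    }
}

final class MissingColumnCheck {
    /**
     * Returns the name of the first column that was not set by the user and has
     * no default value, or null when the row is acceptable.
     * Assumption: bit i of noValueSet is set when column i was not provided.
     */
    static String firstMissingColumn(List<ColumnSketch> columns, BitSet noValueSet) {
        for (int i = noValueSet.nextSetBit(0);
             i >= 0 && i < columns.size();
             i = noValueSet.nextSetBit(i + 1)) {
            ColumnSketch col = columns.get(i);
            if (!col.hasDefault) {
                return col.name; // the row must be rejected because of this column
            }
        }
        return null;
    }
}
```

A caller would reject the row with an error naming the returned column whenever the result is not null.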
Case 2: Unmapped Columns
Server-side API
- Fix Marshaller to reject POJOs with unmapped fields;
- Reject tuples from client connector when schema is outdated (see explanation below).
Client-side API
Client serializes user rows according to the latest known schema. Unmapped columns will not reach the server side. Therefore, the client must reject unmapped columns in user rows (Tuples, POJOs).
However, there is no guarantee that the client always has the latest schema:
- A column might be removed on the server, but the client uses an old schema and validation passes when it should fail;
  - Solution: the server rejects rows with an outdated schema from the client.
- A column might be added on the server, but the client uses an old schema and validation fails when it should pass.
  - Solution: when an unmapped column is detected by the client, it should request the latest schema and retry the validation to avoid false-positive exceptions.
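The "refresh and retry" step for added columns could look roughly like the sketch below. It is an illustration under stated assumptions: rows are modeled as plain maps, column names are compared case-sensitively, and UnmappedColumnValidator and its schema loader are hypothetical names, not part of the Ignite client.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Supplier;

final class UnmappedColumnValidator {
    /** Column names of the cached, possibly stale, schema. */
    private Set<String> knownColumns;

    /** Hypothetical hook that fetches the column names of the latest schema from the server. */
    private final Supplier<Set<String>> latestSchemaLoader;

    UnmappedColumnValidator(Set<String> cachedColumns, Supplier<Set<String>> latestSchemaLoader) {
        this.knownColumns = cachedColumns;
        this.latestSchemaLoader = latestSchemaLoader;
    }

    /** Throws if the row has columns that are unknown even to the latest schema. */
    void validate(Map<String, ?> row) {
        if (unmapped(row).isEmpty()) {
            return; // fast path: the cached schema covers every column
        }

        // The column may have been added on the server: reload once and re-check
        // before reporting an error, to avoid a false positive.
        knownColumns = latestSchemaLoader.get();

        Set<String> stillUnmapped = unmapped(row);
        if (!stillUnmapped.isEmpty()) {
            throw new IllegalArgumentException("Unmapped columns: " + stillUnmapped);
        }
    }

    private Set<String> unmapped(Map<String, ?> row) {
        Set<String> extra = new TreeSet<>(row.keySet());
        extra.removeAll(knownColumns);
        return extra;
    }
}
```

The schema is reloaded only when a mismatch is found, so rows that match the cached schema incur no extra round trip.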
The fact that the server rejects rows with outdated schema from the client also simplifies client schema synchronization logic - we won't have to deal with things like IGNITE-19241 (Java thin 3.0: propagate table schema updates to client on write-only operations) anymore. The client will simply reload the schema when it receives a specific error code.
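The reject-and-reload handshake can be sketched as below. This is only an illustration: the result codes, the rule "accept only the current schema version", and all names (SchemaRetrySketch, serverAccept, sendWithReload) are assumptions made for the example; the actual error code and version comparison are not specified in this ticket.

```java
import java.util.function.IntSupplier;
import java.util.function.IntUnaryOperator;

final class SchemaRetrySketch {
    static final int OK = 0;
    static final int SCHEMA_OUTDATED = 1; // hypothetical error code

    /** Server side (assumed rule): accept only rows serialized with the current schema version. */
    static int serverAccept(int rowSchemaVersion, int currentSchemaVersion) {
        return rowSchemaVersion == currentSchemaVersion ? OK : SCHEMA_OUTDATED;
    }

    /**
     * Client side: send the row with the cached schema version; on SCHEMA_OUTDATED
     * reload the schema and resend. Returns the schema version that was accepted.
     * In a real client the row would be re-validated and re-serialized before each resend.
     */
    static int sendWithReload(int cachedVersion,
                              IntSupplier reloadLatestVersion,
                              IntUnaryOperator send,
                              int maxAttempts) {
        int version = cachedVersion;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (send.applyAsInt(version) == OK) {
                return version;
            }
            version = reloadLatestVersion.getAsInt();
        }
        throw new IllegalStateException("Schema kept changing; gave up after " + maxAttempts + " attempts");
    }
}
```

The attempt limit is a design choice of this sketch: it keeps the client from looping indefinitely if the schema changes repeatedly while the request is in flight.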
Schemas and Transactions
IEP-98 Schema Synchronization proposes more complex logic for handling schema updates within transactions. This may alter the way we validate schemas on the server, but should not affect the client: if a given schema version is observed by the client, any server node should be able to handle this version (potentially waiting for it to be installed before proceeding).
Implementation Notes
Client and server APIs implement the same interfaces. Therefore, the same tests should run against both APIs and ensure identical behavior (see ItSqlSynchronousApiTest as an example of this approach).