IGNITE-19834

Thin 3.0: Schema validation

Details

    • Type: Epic
    • Status: Reopened
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: 3.0.0-beta2
    • Component/s: thin client
    • Epic Name: Thin 3.0: Schema validation

    Description

      Motivation

      Current Ignite 3 behavior is inconsistent when user data has unmapped columns:

      • POJO: unmapped columns (not in schema) are ignored;
      • Tuple: unmapped columns are ignored on the client, but cause an exception on the server (when the server-side API is used from a Compute task).

      We should ensure consistent and reliable behavior across all APIs and clients.

      Non-goals

      • Validate column types (already handled by serializers)
      • Deal with any other schema aspects (indexes, constraints) which are not present on the client side

      Requirements

      All APIs (Record, KeyValue, RecordBinary, KeyValueBinary) must reject incompatible rows. A row is incompatible when:

      • Unmapped columns are present;
      • Columns without a default value are missing.

      Where validation happens:

      • Validation should be performed by the server when possible;
      • Unmapped columns should be validated by the client, because rows are serialized according to the schema (the server does not see unmapped columns).
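
      The two incompatibility conditions can be sketched as a single check. This is a minimal illustration only: RowCompatibility and its schema representation (column name mapped to "has a default value") are hypothetical and are not part of the Ignite API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of the row compatibility rules; not an Ignite class. */
final class RowCompatibility {
    /**
     * @param schemaHasDefault schema columns: column name -> whether the column has a default value
     * @param rowColumns       names of the columns provided by the user
     * @return human-readable violations; an empty list means the row is compatible
     */
    static List<String> violations(Map<String, Boolean> schemaHasDefault, Set<String> rowColumns) {
        List<String> result = new ArrayList<>();

        // Rule 1: every provided column must exist in the schema.
        for (String col : rowColumns) {
            if (!schemaHasDefault.containsKey(col)) {
                result.add("unmapped column: " + col);
            }
        }

        // Rule 2: a column may be omitted only if it has a default value.
        for (Map.Entry<String, Boolean> e : schemaHasDefault.entrySet()) {
            if (!rowColumns.contains(e.getKey()) && !e.getValue()) {
                result.add("missing column without default: " + e.getKey());
            }
        }

        return result;
    }
}
```

      In this sketch a row is rejected as soon as the returned list is non-empty; which side (client or server) performs each check is covered in the Design section.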

      Design

      Case 1: Missing Columns

      Already handled by the client and the server:

      • Client sends noValueSet to indicate which columns were not provided by the user;
      • Server rejects rows when the column is not set by the user and does not have a default value.
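
      As an illustration of the existing mechanism, the sketch below models noValueSet as a java.util.BitSet indexed by column position. The class, method names, and the BitSet representation are assumptions made for this example; the actual wire format and server code may differ.

```java
import java.util.BitSet;
import java.util.List;
import java.util.Map;

/** Hypothetical model of the noValueSet handshake; not the actual Ignite implementation. */
final class NoValueSetSketch {
    /** Client side: marks the index of every schema column that is absent from the user row. */
    static BitSet buildNoValueSet(List<String> schemaColumns, Map<String, ?> userRow) {
        BitSet noValueSet = new BitSet(schemaColumns.size());
        for (int i = 0; i < schemaColumns.size(); i++) {
            // containsKey (not get) so that an explicit null still counts as "provided".
            if (!userRow.containsKey(schemaColumns.get(i))) {
                noValueSet.set(i);
            }
        }
        return noValueSet;
    }

    /** Server side: rejects the row if an unset column has no default value. */
    static void validate(List<String> schemaColumns, boolean[] hasDefault, BitSet noValueSet) {
        for (int i = noValueSet.nextSetBit(0); i >= 0; i = noValueSet.nextSetBit(i + 1)) {
            if (!hasDefault[i]) {
                throw new IllegalArgumentException(
                        "Missing value for column without default: " + schemaColumns.get(i));
            }
        }
    }
}
```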

      Case 2: Unmapped Columns

      Server-side API

      • Fix Marshaller to reject POJOs with unmapped fields;
      • Reject tuples from client connector when schema is outdated (see explanation below).
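
      A rough sketch of the POJO check follows. It uses plain reflection and exact name matching; PojoFieldCheck is a hypothetical helper, and the real Marshaller may map fields to columns differently (for example, with case-insensitive names or inherited fields, which this sketch ignores).

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper that finds POJO fields with no matching schema column. */
final class PojoFieldCheck {
    /** Sample POJO used for illustration: "nickname" is assumed to have no column. */
    static final class Person {
        static final int VERSION = 1; // static fields are not row data
        int id;
        String name;
        String nickname;
    }

    /** Returns declared instance fields of the class that are not present in the schema. */
    static Set<String> unmappedFields(Class<?> pojoClass, Set<String> schemaColumns) {
        Set<String> unmapped = new TreeSet<>();
        for (Field f : pojoClass.getDeclaredFields()) {
            int mod = f.getModifiers();
            if (Modifier.isStatic(mod) || Modifier.isTransient(mod) || f.isSynthetic()) {
                continue; // not part of the row
            }
            if (!schemaColumns.contains(f.getName())) {
                unmapped.add(f.getName());
            }
        }
        return unmapped;
    }
}
```

      With this sketch, a marshaller would throw when unmappedFields returns a non-empty set instead of silently dropping those fields.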

      Client-side API
      Client serializes user rows according to the latest known schema. Unmapped columns will not reach the server side. Therefore, the client must reject unmapped columns in user rows (Tuples, POJOs).

      However, there is no guarantee that the client always has the latest schema:

      • Column might be removed on the server, but the client uses old schema and validation passes when it should fail;
        • Solution: server rejects rows with outdated schema from the client.
      • Column might be added on the server, but the client uses old schema and validation fails when it should pass.
        • Solution: when an unmapped column is detected by the client, it should request the latest schema and retry the validation to avoid false-positive exceptions.
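
      The "reload and retry" step for the added-column case could look roughly like this. All names are hypothetical; the Supplier stands in for whatever request the client uses to load the latest schema.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Supplier;

/** Hypothetical client-side check for unmapped columns with a single schema reload. */
final class ClientUnmappedColumnCheck {
    /** Returns row columns that are not present in the schema. */
    static Set<String> unmapped(Set<String> schemaColumns, Set<String> rowColumns) {
        Set<String> result = new TreeSet<>(rowColumns);
        result.removeAll(schemaColumns);
        return result;
    }

    /**
     * Validates the row against the cached schema; on failure, reloads the schema once
     * and validates again, so that a recently added column does not cause a false failure.
     *
     * @return the schema the row was successfully validated against
     */
    static Set<String> validate(Set<String> cachedSchema,
                                Supplier<Set<String>> loadLatestSchema,
                                Set<String> rowColumns) {
        if (unmapped(cachedSchema, rowColumns).isEmpty()) {
            return cachedSchema; // fast path: no extra round trip
        }

        Set<String> latest = loadLatestSchema.get();
        Set<String> stillUnmapped = unmapped(latest, rowColumns);
        if (!stillUnmapped.isEmpty()) {
            throw new IllegalArgumentException("Unmapped columns: " + stillUnmapped);
        }
        return latest;
    }
}
```

      The reload happens only on the failure path, so rows that match the cached schema cost no additional requests.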

      Because the server rejects rows serialized with an outdated schema, client schema synchronization logic also becomes simpler: we no longer have to deal with things like IGNITE-19241 (Java thin 3.0: propagate table schema updates to client on write-only operations). The client will simply reload the schema when it receives a certain error code.
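
      The resulting client loop might be sketched as follows. SchemaMismatchException stands in for the (unspecified) error code, the strict version comparison on the server is a simplification, and the attempt limit is an assumption of this example.

```java
import java.util.function.IntConsumer;
import java.util.function.IntSupplier;

/** Hypothetical sketch of "reload schema on error and retry"; not the Ignite client code. */
final class SchemaRetrySketch {
    /** Stands in for the error code that signals an outdated client schema. */
    static final class SchemaMismatchException extends RuntimeException {
        SchemaMismatchException(int rowVersion, int currentVersion) {
            super("Row schema version " + rowVersion + " is outdated, current is " + currentVersion);
        }
    }

    /** Server side (simplified): accepts only rows serialized with the current schema version. */
    static void serverAccept(int currentVersion, int rowVersion) {
        if (rowVersion != currentVersion) {
            throw new SchemaMismatchException(rowVersion, currentVersion);
        }
    }

    /**
     * Client side: sends the row using the cached schema version; on a mismatch,
     * reloads the schema and sends again, up to maxAttempts times.
     *
     * @return the schema version with which the row was accepted
     */
    static int sendWithRetry(int cachedVersion, IntConsumer send, IntSupplier reloadSchema, int maxAttempts) {
        int version = cachedVersion;
        for (int attempt = 1; ; attempt++) {
            try {
                send.accept(version); // row is (re)serialized with this schema version
                return version;
            } catch (SchemaMismatchException e) {
                if (attempt >= maxAttempts) {
                    throw e;
                }
                version = reloadSchema.getAsInt();
            }
        }
    }
}
```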

      Schemas and Transactions
      IEP-98 Schema Synchronization proposes more complex logic for handling schema updates within transactions. This may alter the way we validate schemas on the server, but should not affect the client: if a given schema version is observed by the client, any server node should be able to handle this version (potentially waiting for it to be installed before proceeding).

      Implementation Notes

      Client and server APIs implement the same interfaces. Therefore, the same tests should run against both APIs and ensure identical behavior (see ItSqlSynchronousApiTest as an example of this approach).
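
      The shared-test idea can be illustrated without any test framework: one contract is written against a common interface and then run against each implementation. RecordSink and strictSink are hypothetical stand-ins for the real table view interfaces and their client and server implementations.

```java
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of one contract check reused for several implementations. */
final class SharedContractSketch {
    /** Minimal stand-in for an interface implemented by both client and server APIs. */
    interface RecordSink {
        void upsert(Map<String, Object> row);
    }

    /** The contract, written once: a row with an unmapped column must be rejected. */
    static void assertRejectsUnmappedColumn(RecordSink sink) {
        try {
            sink.upsert(Map.of("ID", 1, "UNKNOWN", "x"));
        } catch (IllegalArgumentException expected) {
            return;
        }
        throw new AssertionError("Row with unmapped column was accepted");
    }

    /** Toy implementation that rejects any column outside the given schema. */
    static RecordSink strictSink(Set<String> schemaColumns) {
        return row -> {
            if (!schemaColumns.containsAll(row.keySet())) {
                throw new IllegalArgumentException("Unmapped columns in " + row.keySet());
            }
        };
    }
}
```

      In a real test suite the same effect is usually achieved with an abstract test class whose subclasses supply the client-side or server-side implementation.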

          People

            Assignee: Pavel Tupitsyn (ptupitsyn)
            Reporter: Pavel Tupitsyn (ptupitsyn)