Commit 1576367 from Dag H. Wanvik in branch 'code/trunk'
[ https://svn.apache.org/r1576367 ]
DERBY-532 Support deferrable constraints
Patch derby-532-check-constraints-2 which implements deferred CHECK
constraints and supporting tests.
The high level approach is as follows. When a violation occurs, we note the row
location in the base table of the offending row. At commit time (or when
switching a constraint to immediate), we revisit those rows using the row
locations if they are still valid, and validate those rows again. This is
achieved by positioning to the saved row locations in combination with a
specially crafted result set: ValidateCheckConstraintResultSet (see
ProjectRestrictResultSet#getNextRowCore) which positions to the offending base
row using ValidateCheckConstraintResultSet#positionScanAtRowLocation before
letting ValidateCheckConstraintResultSet read the row. If the row locations are
no longer valid, e.g. an intervening compress happened, we do a full table scan
to verify the constraints instead.
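
As a rough illustration of the approach above (not Derby's actual code), the following sketch models a table as an in-memory map. Row locations are simply map keys, and the class name, the method names and the `invalidateLocations` hook standing in for a compress are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Toy model of a table with one deferred CHECK constraint (illustrative only). */
class DeferredCheckSketch {
    // A "row location" is modelled as a map key; this is an assumption of the sketch.
    private final Map<Integer, Integer> rows = new HashMap<>();
    private final List<Integer> offendingLocations = new ArrayList<>();
    private final Predicate<Integer> check;
    private int nextLocation = 0;
    private boolean locationsValid = true;

    DeferredCheckSketch(Predicate<Integer> check) {
        this.check = check;
    }

    /** Inserts a row; a violation is only noted, not raised. */
    int insert(int value) {
        int loc = nextLocation++;
        rows.put(loc, value);
        if (!check.test(value)) {
            offendingLocations.add(loc);
        }
        return loc;
    }

    /** Updates a row in place; a violation is only noted, not raised. */
    void update(int loc, int value) {
        rows.put(loc, value);
        if (!check.test(value)) {
            offendingLocations.add(loc);
        }
    }

    /** Hypothetical stand-in for an operation that moves rows, e.g. a compress. */
    void invalidateLocations() {
        locationsValid = false;
    }

    /** Returns true if the constraint holds at commit time. */
    boolean validateAtCommit() {
        if (!locationsValid) {
            // Saved locations can no longer be trusted: fall back to a full scan.
            return rows.values().stream().allMatch(check);
        }
        // Revisit only the rows that were noted as offending.
        for (int loc : offendingLocations) {
            Integer value = rows.get(loc);
            if (value != null && !check.test(value)) {
                return false;
            }
        }
        return true;
    }
}
```

A row that violated the constraint when written but was corrected before commit passes the revisit, which is the point of deferring the check.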
Adding a constraint in deferred constraint mode is currently sub-optimal, since
we do a full table scan via an internally generated "SELECT .. WHERE
NOT <constraints>", and we don't have a way to get at the row locations of the
offending rows in this case. I might add a specially tailored result set for
that purpose later.
Normally, when a row is inserted or updated, we execute a generated method which
combines evaluation of all check constraints on the table relevant for the
inserted or updated columns. This evaluation is performed using McCarthy boolean
evaluation (short-circuits as soon as result is known). This isn't optimal for
deferred constraints, as we'd need to assume all constraints were violated in
such a case. The implementation replaces the short-circuited evaluation with a
full evaluation, so we can remember exactly which constraints were violated,
cf. AndNoShortCircuitNode and SQLBoolean#throwExceptionIfImmediateAndFalse. A
violation in throwExceptionIfImmediateAndFalse when we have a deferred
constraint is noted (DMLWriteResultSet#rememberConstraint implemented by
UpdateResultSet and InsertResultSet) by adding the violation to a list for that
row. After the insert/update is completed, the set of violations is remembered
for posterity by inspecting the lists, cf. InsertResultSet#normalInsertCode and
UpdateResultSet#collectAffectedRows.
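
The difference between short-circuit and full evaluation can be sketched as follows. This is a simplified model, not the generated method or the real SQLBoolean API: the `Check` class and `evaluateAll` are hypothetical names, and an immediate violation is signalled with a plain `IllegalStateException`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

/** Illustrative model of evaluating all check constraints without short-circuiting. */
class CheckEvaluationSketch {
    /** A named check constraint that is either immediate or deferred. */
    static final class Check {
        final String name;
        final boolean deferred;
        final IntPredicate predicate;

        Check(String name, boolean deferred, IntPredicate predicate) {
            this.name = name;
            this.deferred = deferred;
            this.predicate = predicate;
        }
    }

    /**
     * Evaluates every constraint against the value. An immediate violation
     * raises an exception at once; deferred violations are collected so the
     * caller knows exactly which constraints failed for this row.
     */
    static List<String> evaluateAll(List<Check> checks, int value) {
        List<String> violatedDeferred = new ArrayList<>();
        for (Check c : checks) {
            if (!c.predicate.test(value)) {
                if (!c.deferred) {
                    throw new IllegalStateException("check " + c.name + " violated");
                }
                violatedDeferred.add(c.name);
            }
        }
        return violatedDeferred;
    }
}
```

With short-circuit evaluation the loop would stop at the first failing constraint, so a second violated deferred constraint on the same row would go unrecorded.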
Note that we currently do not note which constraints were violated *for each
individual row*, only per table in the transaction. This means that we visit
potentially more rows over again when a single constraint is changed to
immediate. This could be improved further by storing the set of violated
constraints along with the row location.
For bulk insert and deferred (see panel 1 below) insert row processing there are
special code paths, cf. InsertResultSet#offendingRowLocation which is invoked
via a callback from HeapController#load and another path in
For update, the code for deferred treatment is in one of
UpdateResultSet#collectAffectedRows and UpdateResultSet#updateDeferredRows
depending on whether there are triggers.
The existing test ConstraintCharacteristicsTest has been built out by adding
check constraints to those fixtures for which they are relevant, as well as
adding new ones which are only relevant for check constraints.
The term "deferred" in "deferred insert row processing" refers to Derby's
special handling of rows in certain situations. For example, when doing an
insert which uses the same table as a source result set, we need to make sure
we don't get confused and see the incrementally inserted rows "again" as we
process the original result set. Essentially we take a snapshot of the source
result set, hence "deferred rows".
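
The "deferred rows" idea can be shown with a small analogy outside Derby. The sketch below uses a plain Java list as the table; `insertFromSelf` is a hypothetical name, and the snapshot copy plays the role of the deferred rows.

```java
import java.util.ArrayList;
import java.util.List;

/** Analogy for an insert whose source is the target table itself (illustrative only). */
class DeferredRowsSketch {
    /**
     * Appends value + 1 for every row that was in the table when the
     * statement started, roughly like INSERT INTO t SELECT x + 1 FROM t.
     */
    static void insertFromSelf(List<Integer> table) {
        // Snapshot the source first so newly inserted rows are not seen "again".
        List<Integer> snapshot = new ArrayList<>(table);
        for (int value : snapshot) {
            table.add(value + 1);
        }
    }
}
```

Iterating over the list directly while adding to it would fail with a `ConcurrentModificationException` for an `ArrayList`, which is the in-memory counterpart of the confusion the snapshot avoids.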
All regressions passed.
Detailed code comments:
Extended and refactored slightly existing mechanism for deferred primary
key/unique constraints to also cater for check constraints. Since the hash key
we used for the memory of primary key and unique constraints was the
conglomerate id of the indexes, and those are guaranteed to be disjoint from the
conglomerate ids of the base tables having deferred constraints, we can use the
same hash table to find the "memory" in the form of the disk based hash table
(BackingStoreHashtable), cf. LCC#getDeferredHashTables.
Code to drop any deferred constraints memory in the transaction when a
constraint is dropped.
Callback added for bulk insert in the presence of deferrable check constraints.
Extra plumbing to be able to signal to HeapController that we need to do a
callback with the inserted row location (for bulk insert).
Extra interface method, offendingRowLocation. Only implemented with meaningful
semantics for NoPutResultSetImpl which calls it for its targetResultSet, an
More parameters to getProjectRestrictResultSet to do the magic mentioned in the
overview for that result set, pass along schema and table name to
InsertResultSet so we can remember them for check violations. They are used to
produce checking SQL statements. This may be a bit fragile, since a rename
schema or table could make those invalid. However, there is presently no RENAME
SCHEMA in Derby and the RENAME TABLE is illegal in certain cases, notably if
there is a check constraint defined on it, so the solution should be OK for
now. Also adds an interface method, getValidateCheckConstraintResultSet, to
allow the execution run-time to build one of those, cf. code generation logic in
Extra parameter to insertRow to get at the row location if needed.
Extra method throwExceptionIfImmediateAndFalse used by deferred check
constraints to make a note of all violated constraints as evaluated by the
generated method. Picked up by InsertResultSet or UpdateResultSet.
AndNoShortCircuitNode is used to represent a non-McCarthy evaluation of the
combined check constraints. See usage in DMLModStatementNode#generateCheckTree.
Extra dummy parameter added for call to super#bindConstraints
(DMLModStatementNode). Only used by insert.
Pick up the DERBY_PROPERTIES value for property "validateCheckConstraint =
<conglomerateId>" we provide to the checking query (internal syntax only)
generated by DeferredConstraintsMemory#validateCheck. The conglomerate id is
used to retrieve the violating rows information set up by
ProjectRestrictResultSet#openCore to drive ValidateCheckConstraintResultSet.
Boolean member variable to know if we have a deferrable check constraint; also
pass only schema and table name to the result set. Passed on to the
InsertConstantAction from which InsertResultSet can pick it up.
Logic to keep track of whether we are used by the special internal query to
check violated check constraints. In this case we also do not push the check
predicates down to store for simpler handling.
Code to parse a long value from "--DERBY-PROPERTIES" property.
Extra code to comply with the sane mode parse tree printing conventions.
Handle different code generation for deferrable check constraints.
Pass on more info: schema and table name + small refactoring.
Handle the new internal query to validate violated check constraints. Cf. query
Open up for check constraints.
ATCA: Special handling of adding a deferred check constraint: need different
code path to get the UUID of constraint soon enough to be able to note any
constraint violations. CCA: note any violation and remember it. We'd like to
remember the row locations of the offending rows here, but this is not done for now, so
at checking time, we'll need a full table scan. This can be improved upon, see
Pass on more info to InsertConstantAction and UpdateConstantAction needed by the
Drives the checking for check constraints, and picks up the result. If we have
violations and deferred constraints, we remember that. Also some refactorings to
avoid local variables shadowing globals.
Drives the checking for check constraints, and picks up the result. If we have
violations and deferred constraints, we remember that.
Removed unused method.
Drive the special result set, ValidateCheckConstraintResultSet by positioning it
correctly for each row retrieved, using the remembered row locations from
Added logic for check constraints. Also added a new check that the user doesn't
specify the same constraint twice, cf. the new test case for it.
Make some members protected rather than private, to let the new result set
ValidateCheckConstraintResultSet inherit from it.
Boilerplate to comply with interface (not used).
The new result set we use to check only the violating rows, based on row location.
New boolean to signal that we want ValidateCheckConstraintResultSet.
Extra logic to handle check constraints (already had it for primary key and unique).
Utility method to determine whether an exception is a transaction deferred
constraint violation. Needed by the XA code.
New error messages
New test cases and extension of present ones to include check constraints
Extension of present test cases to include check constraints.