Attaching derby-481-04-aa-insert.diff. I am running regression tests now.
This patch wires in INSERT support for generated columns. I threaded my way through the INSERT machinery largely by following the way that CHECK constraints are handled.
Before this patch, the compiler built two significant methods for evaluating expressions:
1) A method which populates the base row from whatever data source is driving the INSERT. That data source could be, for instance, a list of literal values or a SELECT statement.
2) A method which runs the CHECK constraints.
My first attempt to support INSERT involved building the generation clauses into method (1). Unfortunately, that method is generated by the data sources, not by the driving INSERT node. I got this approach to work for the degenerate case of inserting a single literal value. But this approach failed when I tried to insert multiple literal values (where the data source is a UNION) and it failed when the data source was a SELECT. It became apparent that this approach would involve wiring code-generation logic into all implementations of ResultSet--there are quite a few. This began to look too complicated so I abandoned this approach.
The current patch represents a second attempt. Here the approach is to give the generation clauses their own method. Now the compiler builds 3 significant methods for evaluating expressions:
1') The original method which populates the base row from a data source (see above).
2') A new method which runs the generation clauses, looking for referenced columns in the row built by (1') and poking the generated values into that row.
3') The original method which runs the CHECK constraints (see above).
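To make the division of labor concrete, here is a minimal Java sketch of the three methods run in order against one row. It is only an illustration: the class name, the table shape T( refCol, generatedCol ), the generation clause (refCol * 2) and the CHECK constraint (generatedCol < 100) are all assumptions made up for the example, and the real methods are generated bytecode rather than lambdas.

```java
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical sketch; none of these names come from the patch.
final class InsertPipelineSketch {
    static final int REF_COL = 0;
    static final int GENERATED_COL = 1;

    /** Runs (1'), (2') and (3') in order and returns the finished row. */
    static Integer[] insertOne(Supplier<Integer[]> populateBaseRow,      // (1')
                               Consumer<Integer[]> runGenerationClauses, // (2')
                               Consumer<Integer[]> runCheckConstraints)  // (3')
    {
        Integer[] row = populateBaseRow.get();
        runGenerationClauses.accept(row);
        runCheckConstraints.accept(row);
        return row;
    }

    /** Assumed table: generatedCol is refCol * 2, with CHECK (generatedCol < 100). */
    static Integer[] demo(int refValue) {
        return insertOne(
            // (1') the data source fills only the non-generated slot
            () -> new Integer[] { refValue, null },
            // (2') read the referenced column, poke the generated value into the row
            row -> row[GENERATED_COL] = row[REF_COL] * 2,
            // (3') the CHECK constraint sees the generated value
            row -> {
                if (row[GENERATED_COL] >= 100) {
                    throw new IllegalStateException("CHECK constraint violated");
                }
            });
    }

    public static void main(String[] args) {
        Integer[] row = demo(21);
        System.out.println(row[REF_COL] + ", " + row[GENERATED_COL]);
    }
}
```

Because (2') runs on the row after (1') has finished, the data source does not need to know anything about generation clauses.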
That was the tricky bit for compilation.
The tricky bit for execution was this: the base row has to be poked into the Activation so that it is visible to the generation clauses when (2') runs. A similar poking is done for CHECK constraints. If you examine this poking for CHECK constraints, you will notice that sometimes the poking is undone after the constraints run and sometimes we don't bother to undo the poking. I don't understand the difference between these code paths. As a result, I have defensively coded the new poking which we need for generated columns. I poke the base row into the Activation just before the generation clauses run. After the generation clauses run, I return the Activation to its previous state.
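The defensive save-and-restore described above can be sketched as follows. This is a hedged illustration, not the code in the patch: FakeActivation is a hypothetical stand-in for the Activation, its two methods are assumed signatures, and the try/finally is my choice for the sketch rather than something the patch is known to do.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; FakeActivation only imitates the idea of an Activation.
final class ActivationRowSketch {
    /** Maps a result set number to the row currently visible to expressions. */
    static final class FakeActivation {
        private final Map<Integer, Object[]> currentRows = new HashMap<>();

        Object[] getCurrentRow(int resultSetNumber) {
            return currentRows.get(resultSetNumber);
        }

        void setCurrentRow(Object[] row, int resultSetNumber) {
            currentRows.put(resultSetNumber, row);
        }
    }

    /** Pokes the base row in, runs the generation clauses, then restores the old row. */
    static void evaluateGenerationClauses(FakeActivation activation,
                                          int resultSetNumber,
                                          Object[] baseRow,
                                          Runnable generationClauses) {
        Object[] saved = activation.getCurrentRow(resultSetNumber);
        activation.setCurrentRow(baseRow, resultSetNumber);
        try {
            generationClauses.run();
        } finally {
            // Return the Activation to its previous state even if a clause throws.
            activation.setCurrentRow(saved, resultSetNumber);
        }
    }

    public static void main(String[] args) {
        FakeActivation activation = new FakeActivation();
        Object[] baseRow = { 1, null };
        evaluateGenerationClauses(activation, 0, baseRow, () -> baseRow[1] = 2);
        System.out.println(activation.getCurrentRow(0) == null ? "restored" : "not restored");
    }
}
```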
Here is a little more detail on the implementation:
A) At bind() time we do the following:
i) Prune out explicit mentions of generated columns. These can arise if the user sets a generated column to the literal DEFAULT--as allowed by the ANSI/ISO syntax. So for instance, the following is legal:
insert into T( refCol, generatedCol ) values ( 1, default )
We prune out the explicitly added generated columns because, later on in the bind() phase, the insert list is expanded to include all columns with defaults (not just generated columns).
ii) When the insert list is expanded to include all defaulted columns, we add in the generated columns but we don't bind their expressions. This is because a generation clause may refer to other columns in the base row, which creates an ordering problem. In addition, we don't yet have a result set number for the base row; we need that number in order to bind references to other columns which may appear in the generation clauses.
iii) Later on, just before we parse and bind the CHECK constraints, we parse and bind the generation clauses. At this point, we have enough context to bind the referenced columns.
B) At generate() time, we generate method (2') in between generating (1') and (3'). The generated (2') method is now one of the arguments to the factory method which creates the execution-driver, the InsertResultSet. This is just like what we do for CHECK constraints: the generated (3') method is also an argument to the instantiation of the InsertResultSet.
C) At execution time, we evaluate (2') just before we evaluate (3').
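Steps (A.i) and (A.ii) can be pictured with a small Java sketch. It models the insert list as a map from column name to value text, which is a simplification I am assuming for illustration; the Column class, the method name and the placeholder marker are hypothetical and are not the compiler's actual node types.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of bind-time pruning and expansion.
final class BindInsertListSketch {
    /** Marker for a generated column whose clause is bound later, as in step (A.iii). */
    static final String UNBOUND_GENERATION_CLAUSE = "<unbound generation clause>";

    static final class Column {
        final String name;
        final boolean generated;

        Column(String name, boolean generated) {
            this.name = name;
            this.generated = generated;
        }
    }

    static Map<String, String> bindInsertList(List<Column> tableColumns,
                                              Map<String, String> userList) {
        Map<String, String> result = new LinkedHashMap<>(userList);

        // (A.i) Prune explicit mentions of generated columns.
        for (Column column : tableColumns) {
            if (column.generated) {
                result.remove(column.name);
            }
        }

        // (A.ii) Expand with every missing column. Generated columns are added
        // back, but only as placeholders: their clauses are not bound yet.
        for (Column column : tableColumns) {
            if (!result.containsKey(column.name)) {
                result.put(column.name,
                           column.generated ? UNBOUND_GENERATION_CLAUSE : "DEFAULT");
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Column> t = Arrays.asList(new Column("refCol", false),
                                       new Column("generatedCol", true));
        Map<String, String> userList = new LinkedHashMap<>();
        userList.put("refCol", "1");
        userList.put("generatedCol", "default");
        System.out.println(bindInsertList(t, userList));
    }
}
```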
Touches the following files:
Adds a method so that a ResultColumn can report whether it represents a generated column. I also forced all overrides of the expression field to go through the setExpression() method. This, technically speaking, is not necessary--but it made debugging easier for me and I think it will be useful for other developers who need to debug this node.
Changes are made to support both binding and code-generation. These are the bind() changes:
i) Adds a method which raises an error if the user tries to override the value in a generated column with any value other than the DEFAULT literal. For instance, the following is illegal:
insert into T( refCol, generatedCol ) values ( 1, 70 )
In addition, we remove explicit mentions of generated columns because we will add them back when we enhance the INSERT statement with defaulted columns.
ii) Adds logic to parse and bind generated columns. This is modelled on the logic which parses and binds CHECK constraints.
iii) Renames bindCheckConstraint() to bindRowScopedExpression() because this method is now shared by the logic which binds CHECK constraints and the logic which binds generation clauses.
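The override check in (i) amounts to something like the Java sketch below. The method name, the string-based representation of the value list and the exception type are assumptions chosen to keep the example short; the real check works on the compiler's query tree and raises a Derby error.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of rejecting overrides of generated columns.
final class GeneratedColumnOverrideSketch {
    /** Throws if a generated column is given anything other than DEFAULT. */
    static void rejectOverrides(List<String> targetColumns,
                                List<String> valueExpressions,
                                Set<String> generatedColumns) {
        for (int i = 0; i < targetColumns.size(); i++) {
            String column = targetColumns.get(i);
            String value = valueExpressions.get(i).trim();
            if (generatedColumns.contains(column) && !value.equalsIgnoreCase("default")) {
                throw new IllegalArgumentException(
                    "Cannot override generated column " + column + " with " + value);
            }
        }
    }

    public static void main(String[] args) {
        Set<String> generated = new HashSet<>(Arrays.asList("generatedCol"));
        List<String> columns = Arrays.asList("refCol", "generatedCol");

        // Legal: insert into T( refCol, generatedCol ) values ( 1, default )
        rejectOverrides(columns, Arrays.asList("1", "default"), generated);

        // Illegal: insert into T( refCol, generatedCol ) values ( 1, 70 )
        try {
            rejectOverrides(columns, Arrays.asList("1", "70"), generated);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```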
Short-circuits the logic which enhances the base row with defaulted columns. Adds in the generated columns but does not add their generation clauses. This is because the clauses cannot be bound at the same time as the rest of the columns in the base row. We wait to bind them until the time that we bind CHECK constraints.
Wires binding and code-generation calls into bindStatement() and generate().
– CODE GENERATION
Skips code-generation for generated columns when walking the base row. The generateCore() method generates (1'). We need to build the generation clauses into (2') instead and this is done later on.
In addition to the bind() changes described above, adds logic to generate the (2') method.
Adds (2') as an argument to the factory method which instantiates InsertResultSets.
Adds a method for retrieving the current row from the Activation. This allows us to return the Activation to its original state after we have run (2').
Evaluates generation clauses close to where CHECK constraints are evaluated.
Uncomments basic INSERT tests.