DERBY-781: Subquery Materialization (via Hash Join)
Outline of Changes


Note to readers: Code snippets and/or diffs in this document have been edited/trimmed down for the sake of discussion; the actual changes as I have them in my codeline include more comments and may have calls to utility methods that are not be described here.

I) Issue -- Current Behavior -- High Level

When executing a query that has complex subqueries in the FROM clause, Derby does not currently materialize the subqueries.  As a result, queries that perform joins for which the inner result set is a subquery will actually re-execute the subquery for *every* row in the outer result set, which can be painfully slow if the subquery is complex and/or if the subquery in turn performs joins of its own with still other subqueries.  Some sample databases show that this lack of materialization for complex subqueries can lead to execution times of minutes, hours, and even days for some relatively small datasets (a couple thousand rows) if the subqueries are even modestly nested (three or four levels).

II) Solution -- Desired Behavior -- High Level

If the optimizer were able to look at subqueries and to choose to materialize them when it's beneficial to do so, then at execution time we would only have to execute the subqueries a single time per statement, after which Derby could just perform the desired join against the materialized result set.

One can currently achieve a similar effect by creating a temp table for the subquery and then joining with the temp table--depending on the complexity of the subquery used to create the temp table, the performance benefits from this can be quite significant.  But that said, the use of temp tables is not very "user friendly" as it requires the end-user to manually analyze his/her queries and break them down into logical subqueries, which can be a non-trivial exercise depending on the complexity of the query and joins in question.  Further, the user would then have to create all of the required temp tables *every* time s/he wanted to execute the query, which can be expensive--and while it might still be better than having no materialization at all, it's still not an ideal solution.

The ideal solution, then, is to have the optimizer look at a query and figure out whether or not materialization could be beneficial and, if so, the optimizer should pick a query plan that allows such materialization to occur.  Not only does this relieve the user of the task of query analysis/rewrite using temp tables, but it also allows Derby's existing statement caching mechanism to take effect, meaning that after the initial compilation, the user can repeatedly execute the query thereafter without incurring the cost of creating temp tables every single time.

All of that said, Derby _does_ currently have a way of materializing result sets, as found in
Jeff Lichtman's comment on DERBY-781:
"If you think about it, we already have a join strategy that materializes the inner result  set, i.e. hash join.  I would expect the optimizer to at least consider this strategy for the example given in this enhancement request."

More specifically, if the optimizer chooses to do a hash join between two result sets (aka "Optimizables", or FROM tables) then Derby will, as part of the hash join processing, materialize the right result set.  But that said, the optimizer currently does *not* consider hash joins between subqueries.  So the solution I describe here is to update the optimizer so that, in addition to the plans it already considers, it also checks to see if it's possible to do a hash join in cases where the right result set is a subquery.  Then if the hash join is cheaper than a nested loop join, the optimizer will choose the hash join and therefore materialize the right result set--i.e. the subquery.  Thus by allowing the optimizer to consider hash joins with subqueries, we inherit the hash join's existing materialization mechanism for "free".

NOTE: Future work might include changes that let Derby decide to materialize result sets even in cases where hash joins are not possible/chosen, but that is not the goal of the changes described here.
III) Background -- Current Behavior -- In the Code

As part of the compilation phase for a query, the Derby Optimizer looks at all "optimizable" result sets and tries to determine what the best "access path" for each of those result sets is.  An "Optimizable" result set is defined in Derby as any type of result set that could potentially appear within the FROM clause of a SELECT query. If the Optimizable is a base table, then its "access path" determines which (if any) index to use and also indicates which join strategy is most appropriate for that table.  Note that the combination of index choice and join strategy determines which scan type (table, index, or hash) to use for that table.  If the Optimizable is not a base table (for example, if it's a subquery) then the access path consists of the combined access paths from the Optimizable's children, plus the best join strategy for that Optimizable.

For a given FROM list, a "join order" is a specific ordering of the Optimizables in the FROM list.  The optimizer's goal is to find the best join order for the FROM list, where "best join order" is a complete join order (meaning that every Optimizable in the FROM list has an assigned position) such that the total estimated cost of the join order, which is based on the estimated costs of each Optimizable at its current position, is less than any other complete join order found so far.  When placing an Optimizable at a position in the join order, the optimizer first pushes predicates (if any) to the Optimizable where possible, then iterates through the "decorated permutations" for that Optimizable and figures out which decorated permutation yields the lowest cost.  This "decorated permutation" translates into the "access path" for the Optimizable, so as mentioned above, it is composed of index choice plus join strategy for base tables, and is simply the join strategy for non-base tables.

A subquery, which is represented by a ProjectRestrictNode (which is an instance of Optimizable) having a child that is not an Optimizable (namely, the child is a SelectNode), is a non-base table and thus the choices for its "decorated permutations" are simply the choices for join strategy.

That said, there are currently two primary join strategies in Derby: Hash and Nested Loop.  So for non-base table Optimizables in a FROM list, the optimizer can first estimate the cost of that Optimizable (at its current position) using a nested loop join, then it can estimate the cost using a hash join, and finally, it can pick whichever join strategy is best.

The code to perform all of this processing starts in SelectNode.optimize() with the following block:

        /* Optimize this SelectNode */
        while (optimizer.getNextPermutation())
        {
            while (optimizer.getNextDecoratedPermutation())
            {
                optimizer.costPermutation();
            }
        }

By the time we get here, optimizer has been initialized such that its OptimizableList holds an Optimizable for each item in the FROM clause of the SELECT query.  The call to "getNextPermutation()" tells the optimizer to place the next Optimizable at a position in the current join order.  Then the call to "getNextDecoratedPermutation()" tells the optimizer to find out what the next available "decorated permutation" is for the newly-placed Optimizable.  As mentioned above, in the case of subqueries the decorated permutations are simply the available join strategies--i.e. hash join and nested loop join.  Then the call to "costPermtuation" tells the optimizer to figure out how much it costs to have the Optimizable at it's current position with its current decorated permutation.

When we run out of decorated permutations the call to getNextDecoratedPermutation() will ensure that the best permutation is saved and then it will return "false", breaking back to the outer loop and thus indicating that the optimizer should now place the next Optimizable in the join order.  When all possible join orders have been tried (every Optimizable has been placed at every possible position such that all possible orderings have been evaluated) or else when the optimizer "times out" (whichever comes first), the outer loop will terminate, thus concluding the optimization phase of the SELECT node.

IV) Issue -- Current Behavior -- In the Code

We focus our attention now the case of a subquery.  To repeat from above, a subquery is represented by a ProjectRestrictNode (which is an instance of Optimizable) that has a child that is a SelectNode--and a SelectNode is *not* an instance of an Optimizable.

Now assume that we've already made the call to optimizer.getNextPermutation(), so our subquery (ProjectRestrictNode) has been placed at some position in the join order.  We then make the call to optimizer.getNextDecoratedPermutation() and since we're dealing with a non-base table (subqueries are not base tables) we end up getting the first of the two available join strategies: nested loop join.  We then call "costPermutation" on the optimizer, which brings us to OptimizerImpl.costPermutation().  That method retrieves the Optimizable in question (i.e. the PRN for the subquery) and then makes the following call:

        /*
        ** Don't consider non-feasible join strategies.
        */
        if ( ! optimizable.feasibleJoinStrategy(predicateList, this))
        {
            return;
        }

Since our Optimizable is a ProjectRestrictNode, this brings us to the "feasibleJoinStrategy()" method as defined in ProjectRestrictNode:

        /* The child being an Optimizable is a special case.  In that
         * case, we want to get the current access path and join strategy
         * from the child.  Otherwise, we want to get it from this node.
         */
        if (childResult instanceof Optimizable)
        {
            ...
        }
        else
        {
            return super.feasibleJoinStrategy(restrictionList, optimizer);
        }

Since the childResult is an instance of SelectNode, which is not an Optimizable, we take the "else" branch.  Skipping some of the calls for the sake of simplicity, we will ultimately end up at the "feasible" method of the current join strategy--either HashJoinStrategy or NestedLoopJoinStrategy.  In either case we will end making a call to "isMaterializable()" on the ProjectRestrictNode, and in that method we find the key piece of code:

    /** @see Optimizable#isMaterializable
     *
     * @exception StandardException        Thrown on error
     */
    public boolean isMaterializable()
        throws StandardException
    {
        /* RESOLVE - Disallow arbitrary hash joins on
         * SELECTS within a derived table for now.
         * Remove this method once that restriction is removed.
         */
        if (! (childResult instanceof Optimizable))
        {
            return false;
        }

        return super.isMaterializable();
    }

Recalling (again) that a SelectNode is not an Optimizable, we can see that this method will always return false for subqueries.  In the case of NestedLoopJoinStrategy, receiving "false" here will lead to a call to "supportsMultipleInstantiations" on the ProjectRestrictNode, and that method (through inheritance) will end up returning "true"--and thus the NestedLoop join is considered "feasible".

So then, since nested loop join is in fact feasible, the optimizer.costPermutation() method will go on to estimate the cost of the subquery and will save that as the best cost so far.  Then we will end up back in SelectNode.optimize() for another call to getNextDecoratedPermutation(), which will now return the second (and final) join strategy that is available: hash join.  Following the same set of calls that we just outlined, we'll again end up at ProjectRestrictNode.isMaterializable(), which will again return false.  Unlike NestedLoop join strategy, though, HashJoinStrategy.feasible() immediately returns false if the Optimizable is not materializable:

        if (! innerTable.isMaterializable())
        {
            optimizer.trace(Optimizer.HJ_SKIP_NOT_MATERIALIZABLE, 0, 0, 0.0,
                            null);
            return false;
        }

This "false" gets propagated back to the costPermutation() method of OptimizerImpl and ultimately makes it so that *the hash join for the subquery is always considered IN-feasible, which means the optimizer won't go any further--it will give up on the hash join altogether, and in doing so will give up on any attempts at materializing the subquery*.

And so it is that the current Derby optimizer will never consider doing a hash join with a subquery, which in turn means that it will never materialize the subquery result set, and thus we will see the (sometimes massive) performance hit described in section I above.

V) Solution -- Desired Behavior -- In the Code

In an email to derby-dev regarding hash joins for subqueries, Jeff Lichtman wrote: "Unfortunately, I don't remember why hash joins are prohibited in this case. [...]
The only thing I can suggest is to try removing the restriction and see what happens."

  -- see http://article.gmane.org/gmane.comp.apache.db.derby.devel/12208

Jeff's advice coincides nicely with the comments in the existing code shown above: namely, in the "isMaterializable()" method of ProjectRestrictNode is the following comment:

        /* RESOLVE - Disallow arbitrary hash joins on
         * SELECTS within a derived table for now.
         * Remove this method once that restriction is removed.
         */

Per Jeff's suggestion, what I'm doing for the DERBY-781 changes is removing this restriction, and the first step toward that goal is to remove the "isMaterializable()" method from ProjectRestrictNode altogether.  When we do so, the call will now end up at FromTable.isMaterializable(), which does some checking for correlated column references and returns true/false based on that check.  Thus in the case of subqueries, it is now possible for isMaterializable() to return "true", which means that hash joins are now *potentially* feasible with subqueries.

That is not, however, the whole of the fix.  Removing the isMaterializable() method from ProjectRestrictNode makes it so that we won't immediately return false from HashJoinStrategy.feasible().  But there's still more work to be done: namely, we have to figure out if there are any equijoin predicates available that can be used for the hash join.  We again get some direction by reading the aforementioned email from Jeff:

<begin_quote>

One thing I notice in looking at the code is the following in HashJoinStrategy.feasible():


        /* Look for equijoins in the predicate list */
        hashKeyColumns = findHashKeyColumns(
                                             innerTable,
                                             cd,
                                             predList);

I don't know what the predicate list looks like at this point for the query in question. Perhaps there's something about it that makes it hard to find equijoins that should make up the hash key columns.

<end_quote>

And Jeff is absolutely right.  If we stopped our work now (and just removed the hash join restriction from ProjectRestrictNode), the call to findHashkeyColumns() would not find any applicable equijoin predicates and thus would not find any hash keys.  As a result, hashKeyColumns would end up null and the feasible() method would yet again return false, once again rendering the hash join with subquery IN-feasible.

So in order to fully enable hash joins with subqueries, we have to figure out how to find equijoins for hash key columns when the Optimizable in question is a subquery.  In simple terms the problem is that, when checking for applicable equijoin predicates, the optimizer currently looks at the column references on either side of the join predicate and sees if either column reference has the same table number as the Optimizable.  Skipping down to the heart of the matter, the code where this check is made is in the "getColumnOperand()" method of BinaryRelationalOperatorNode, where for each column reference (left and right) of the predicate we do:

            if (cr.getTableNumber() == tableNumber)
            {
                /*
                ** The table is correct, how about the column position?
                */
                if (cr.getSource().getColumnPosition() == columnPosition)
                {
                    /* We've found the correct column - return it */
                    return cr;
                }
            }

Note: tableNumber here is the tableNumber of the Optimizable, which is passed into this method as a parameter.

If the Optimizable is a base table then this simple check is sufficient because there's only one tableNumber involved and that is the table number for the Optimizable.  But if the Optimizable is a ProjectRestrictNode representing a subquery, there could be an arbitrary number of base tables that are further down in the subquery (beneath the ProjectRestrictNode), and if the columns from any of those base tables are projected out, then those columns could be the basis for the equijoin that we need.  However, the above code will simply compare the ProjectRestrictNode's *own* table number to the column reference's table number--so if the column reference is actually pointing to one of the tables in the subquery beneath the PRN, the comparison will fail and we won't find our equijoin predicate.

Perhaps now is a good time for an example.  Suppose the query we have is something like the following:

  select v1.i, t3.b
    from
      t3,
      (select distinct i, j from t1) v1
    where t3.a = v1.i;

Each of the relevant FROM tables will be assigned a tableNumber before optimization; for this example, let's assume T3 has tableNumber 0, V1 has tableNumber 1, and T1 as tableNumber 2.  In this case there will be a ProjectRestrictNode above the subquery V1, so it's actually the ProjectRestrictNode that will have tableNumber 1; then the FromBaseTable for T3 will have tableNumber 0 and the FromBaseTable for T1 will have tableNumber 2.

Now let's assume that the join order we're looking at is {T3, V1} and that we're checking to see if a hash join is feasible.  We'll take the predicate "t3.a = v1.i" and, for each side of the predicate, we'll compare the column reference's source table number to the PRN's tableNumber.  Starting with the left operand, we'll see that the source is T3 and the tableNumber is 0; we'll then compare 0 with 1, and since they're not the same we'll move on to the right operand.  At this point, the correct thing to do is to realize that the operand "v1.i" is pointing to a table that is in the subquery beneath the PRN, and thus that this predicate *is* a valid equijoin predicate for V1.  However, the source for "v1.i" is T1 and thus its tableNumber is 2, which we then compare with the PRN's tableNumber--and since 1 != 2, we'll fail to find the correct operand.  The end result, then, is that we fail to find an equijoin predicate that is applicable to V1 and therefore the hash join is deemed infeasible.

So the second--and more core--part of the DERBY-781 changes is to make the code in BinaryRelationalOperatorNode aware of base tables that are "buried" in subqueries so that it can correctly determine if a given predicate is applicable to the subquery in question.  If we can do that, then the hash join becomes feasible, which allows the optimizer to calculate the cost, and if the cost is lower than for nested loop join, the optimizer can choose hash join as the "best access path" for the subquery.  And that, in turn, means that the subquery can be materialized once during execution and then probed for each outer row, which can make for a major performance improvement for complex subqueries.

VI) Solution Details -- In the Code

In a word, the solution for the findHashKey() problem described in the previous section is to allow the getColumnOperand() method in BinaryRelationalOperatorNode to search a subquery for table numbers and to then use those table numbers when trying to find a match for a specific column operand.  This same thing also needs to be done for the many other methods in BinaryRelationalOperatorNode that rely on a table-number-matching check similar to the one described above.

More specifically, what we want to do for these checks is as follows.  Start with the left operand and, if it's a column reference, do the following:
  1. Walk the subtree beneath the Optimizable (if there is a subtree) and find the table numbers for all base tables in the subtree.  This includes the base table number for the Optimizable itself, if it's non-negative.
  2. Find the _base_ table number for the column reference.  Note that this could potentially be different from the number returned from columnReference.getTableNumber() because it's possible that the column reference's source is not a base table itself, but is some FromTable that is sitting between the column reference and the base table that it (the column reference) ultimately references.  In that case the column reference's tableNumber will be the tableNumber of the intermediate FromTable, not that of the actual base table.  When searching for an equijoin, though, we want to figure out if the *base* table (and column) that is ultimately referenced by the column reference matches any of the projected *base* tables (and columns) from the subtree--if so, then we have a viable equijoin predicate.
  3. See if the the list of table numbers from step "i" includes the table number from step ii; if so, then consider this a match.
If we still do not have a match, or if the left operand is not a column reference, then try the same thing with the right operand (if the right operand is a column reference).  If after that we still do not have a match, then the BinaryRelationalOperator in question cannot be used as an equijoin for the hash.

Revisiting the example given in section V above, let's again assume that we're trying to find out if a hash join is feasible with V1 as the inner table.  Our BinaryRelationalOperatorNode corresponds to "
t3.a = v1.i", T3's tableNumber is 0, the PRN over the subquery has tableNumber 1, and T1's tableNumber is 2.  Then we have:

<start with left operand>

 Step i: The Optimizable (PRN) has table number 1, and if we walk the subtree of the PRN we will also find table number 2 (for T1).
 Step ii: The base table number for "t3.a" is the tableNumber for t3, which is 0.
 Step iii: The set {1, 2} does not contain 0, so we do not have a match.

<try with right operand>

 Step i: The PRN has table number 1 and if we walk its subtree we find tableNumber 2 (for T1).
 Step ii: The base table number for "v1.i" is the tableNumber for t1, which is 2.
 Step iii: The set {1, 2}
does include 2, so we have an equijoin predicate that applies to v1.

To apply these steps in the actual code, I updated the public "getColumnOperand()" methods in BinaryRelationalOperatorNode by replacing the old comparison of tableNumbers with a call to a new method, "valNodeReferencesOptTable()", that performs the three steps outlined above in order to determine if the passed ValueNode (esp. column reference) references a base table that is at or below (i.e. in the subtree beneath) the passed FromTable:

-            if (cr.getTableNumber() == tableNumber)
+            if (valNodeReferencesOptTable(cr, ft, false, walkSubtree))

Note that as part of this change I did some refactoring of the methods in BinaryRelationalOperator.  In particular, I removed the private version of getColumnOperand() that took a table number and a column position (instead of an Optimizable and a column position).  That version of the method was only used for a single case, which was in PredicateList.java as part of the "constantColumn" method, and it did the same thing as getExpressionOperand() in the same class with one difference: getExpressionOperand(int, int) searched for a matching operand and then returned the *other* side while getColumnOperand(int, int) searched for a matching operand and returned that same operand.  So I combined the two into a single new method, getOperand(), that takes a ColumnReference and a boolean value.  The getOperand() method does a more extensive search for the target column reference (by looking at _base_ table numbers) and then returns either the matching operand or the "other" side, depending on the boolean parameter.  And of course, I made the corresponding changes to RelationalOperator and UnaryComparisonOperator, as well.

That done, it turns out there are several other places in the BinaryRelationalOperatorNode where we're trying to find a column operand (esp. column reference) that corresponds to a target FromTable, and in order for a hash join with a subquery to be successful throughout optimization, code generation, and query execution, all of those other places need to be updated in the same way--i.e. the simple comparison of tableNumbers has to be replaced with a call to the new valNodeReferencesOptTable() method to walk the FromTable's subtree.  In particular, the following methods all required this change (or something very similar):

  - getColumnOperand()
  - getExpressionOperand()
  - keyColumnOnLeft()
  - columnOnOneSide()
  - isQualifier()

Note that, of these methods, getExpressionOperand() was the only one that didn't currently take the target Optimizable as a parameter (it just took a tableNumber), so I had to modify that method's signature to take an Optimizable and pass it on to valNodeReferencesOptTable().

As for valNodeReferencesOptTable(), it simply performs steps i, ii, and iii as described above:

+    private boolean valNodeReferencesOptTable(ValueNode valNode,
+        FromTable optTable, boolean forPush, boolean walkOptTableSubtree)
+    {
+        // Following call will initialize/reset the btnVis,
+        // valNodeBaseTables, and optBaseTables fields of this object.
+        initBaseTableVisitor(optTable.getReferencedTableMap().size(),
+            walkOptTableSubtree);
 
+        boolean found = false;
+        try {
 
+            // Find all base tables beneath optTable and load them
+            // into this object's optBaseTables map.  This is the
+            // list of table numbers we'll search to see if the
+            // value node references any tables in the subtree at
+            // or beneath optTable.
+            if (walkOptTableSubtree)
+                buildTableNumList(optTable, forPush);
+
+            // Now get the base table numbers that are in valNode's
+            // subtree.  In most cases valNode will be a ColumnReference
+            // and this will return a single base table number.
+            btnVis.setTableMap(valNodeBaseTables);
+            valNode.accept(btnVis);
+
+            // And finally, see if there's anything in common.
+            valNodeBaseTables.and(optBaseTables);
+            found = (valNodeBaseTables.getFirstSetBit() != -1);
+
+        } catch (StandardException se) {
+            if (SanityManager.DEBUG)
+            {
+                SanityManager.THROWASSERT("Failed when trying to " +
+                    "find base table numbers for reference check:\n" +
+                    se.getMessage());
+            }
+        }
+
+        return found;
+    }

There are several things to notice about this method, the first of which is the "forPush" parameter.  As mentioned above, one of the methods in BinaryRelationalOperatorNode that had to be modified to accommodate hash joins with arbitrary subqueries was "isQualifier()".  This is significant because, based on inspection of the code (esp. PredicateList.java), I've noticed that that the "isQualifier()" method is actually used for two different (but related) purposes:  1) see if this relational operator (or more specifically, the predicate to which this operator belongs) can be used as a join predicate (esp. for a hash join), and 2) see if this operator can be pushed to the target optTable.  The second use scenario only ever occurs during the "modify access paths" phase of compilation, which occurs after optimization has completed.  I use the parameter "forPush" to distinguish between the two uses because in some cases (esp. situations where we have subqueries) the answer to "is this a qualifier?" can differ depending on whether or not we're pushing.  In particular, for binary ops that are join predicates, if we're simply trying to find an equijoin predicate then we say that the operator is a qualifier if exactly one of its operands references either the target table OR any of the base tables in the target table's subtree.  But if we're planning to _push_ the predicate down to the target table, we only say that the operator qualifies if exactly one of its operands reference the target table *directly* (which is the existing behavior--i.e. prior to the changes described in this document).

This difference in behavior is required because in case 1 (searching for join predicates), the operator remains at its current level in the tree even if its operands reference nodes further down; in case 2, though, we'll end up pushing the operator down the tree to child node(s), which requires additional logic--and especially "scoping" consideration.  (For more on the kinds of work involved in "scoping" a predicate, see the HTML document attached to DERBY-805.)  Until scoping requirements/considerations have been addressed for this stage of compilation, I've added logic to BinaryRelationalOperatorNode to ensure that, when trying to determine if a predicate is a qualifier for some FromTable, Derby will only search the subtree if the intent is *not* to push this operator further down that subtree.

Another thing to note here is that the three-step algorithm is written to work with any FromTable--as opposed to just working with a ProjectRestrictNode whose child is a SelectNode.  There are three reasons for doing it this way.  The first is that the changes to BinaryRelationalOperatorNode--and in particular, to isQualifier()--can be called as part of the "modify access paths" phase of compilation, by which time a SelectNode will have been replaced with an appropriate instance of FromTable--ex. another PRN, or a DistinctNode, etc.  The behavior of the new code must be consistent between the "optimize" and "modifying access path" phases of compilation, and thus it should work regardless of whether or not a subquery has been transformed into a different (but equivalent) node or set of nodes in the query tree.  The second reason is that many of the five methods listed above (getColumnOperand(), getExpressionOperand(), keyColumnOnLeft(), columnOnOneSide(), and isQualifier()) are called during optimization for purposes outside of checking the feasibility of hash joins--but as it turns out, the ability to search a subtree for matching table numbers can prove beneficial in those other scenarios, as well.  And the third reason is that the logic we add must still work when the optimizer is considering hash joins between other, non-subquery FromTables (i.e. we don't want to break existing hash join optimizations).  For these three reasons the new code shown above and below is designed to work on FromTables in general instead of specific subquery nodes like PRNs-over-SelectNodes.  But this decision does not come for free: see section VII below for a description of some additional changes that are required as a result of this particular implementation.

And finally, we note the call to the buildTableNumList(), which is defined as follows:

+    /**
+     * Create a set of table numbers to search when trying to find
+     * which (if either) of this operator's operands reference the
+     * received target table.  At the minimum this set should contain
+     * the target table's own table number.  After that, if we're
+     * _not_ attempting to push this operator (or more specifically,
+     * the predicate to which this operator belongs) to the target
+     * table, we go on to search the subtree beneath the target
+     * table and add any base table numbers to the searchable list.
+     *
+     * @param ft Target table for which we're building the search
+     *  list.
+     * @param forPush Whether or not we are searching with the intent
+     *  to push this operator to the target table.
+     */
+    private void buildTableNumList(FromTable ft, boolean forPush)
+        throws StandardException
+    {
+        // Start with the target table's own table number.  Note
+        // that if ft is an instanceof SingleChildResultSetNode, its
+        // table number could be negative.
+        if (ft.getTableNumber() >= 0)
+            optBaseTables.set(ft.getTableNumber());
+
+        if (forPush)
+        // nothing else to do.
+            return;
+
+        // Add any table numbers from the target table's
+        // reference map.
+        optBaseTables.or(ft.getReferencedTableMap());
+
+        // The table's reference map is not guaranteed to have
+        // all of the tables that are actually used--for example,
+        // if the table is a ProjectRestrictNode or a JoinNode
+        // with a subquery as a child, the ref map will contain
+        // the number for the PRN above the subuquery, but it
+        // won't contain the table numbers referenced by the
+        // subquery.  So here we go through and find ALL base
+        // table numbers beneath the target node.
+        btnVis.setTableMap(optBaseTables);
+        ft.accept(btnVis);
+        return;
+    }

Okay, so with that we've now completed the primary changes required to enable hash joins with subqueries, and thus we have (implicit) subquery materialization.  There are, however, two final modifications to the code that are required before things start working as expected.  We look at those next...

VII) Additional Required Changes

Once the changes described in section VI have been implemented, Derby will consider using hash joins with subqueries and, if it's feasible to do so, the optimizer will estimate the cost of the hash join, compare it with the cost of a nested loop join, and then choose to do the hash if that's the cheapest option.  All of that is good--we're getting closer to where we want to be.  But there are still three other areas in the code to look at: first, we have to make sure that, during the "modify access paths" phase of compilation, Derby will generate the correct plan for the hash join.  Second, we have to add some additional checks to ensure that Derby doesn't incorrectly choose to do an "unsafe" hash join.  And third, I made a slight change to the optimizer to give it better chance of finding the "best" join order when doing a hash join with a subquery.

A. Generating a Hash Join plan with a Subquery

After applying the changes described in section VI above, Derby might decide that it is in fact cheapest to do a hash join with some subquery.  However, when it comes time to generate the corresponding plan--which is done as part of the "modify access paths" phase of compilation--the existing code might fail for a couple of reasons.

As mentioned earlier, a subquery in Derby will have a ProjectRestrictNode sitting above it throughout optimization.  This means that when it comes time to generate the query plan for a subquery, Derby will start in the modifyAccessPath() method of the ProjectRestrictNode (PRN) that sits above the subquery.  In this method the PRN does the following:
  1. Make a call to generate the query plan for the subquery
  2. Try to push predicates to the subquery
  3. If the optimizer chose a hash join, the PRN will make a call to generate a HashTableNode for the subquery node--and that HashTableNode translates into a materialized result set at execution time.
Step two is where we run into the first problem:

        if ((restrictionList != null) && !alreadyPushed)
        {
            restrictionList.pushUsefulPredicates((Optimizable) childResult);
        }

In cases where we have a PRN over a subquery and the optimizer chooses to do a hash join, the above "if" condition will evaluate to true, which means that we'll try to push predicates down into the subquery if we think they could be useful there.  The problem is that, while a predicate might be "useful" for the subquery for a nested loop join, the optimizer ultimately chose to do a hash join, instead--and that means that we do *not* want to push any of the predicates down.  The following change, which solves this particular problem, includes comments explaining why we need to skip pushing predicates for a hash join:

+        // If we're doing a hash join with _this_ PRN (as opposed to
+        // with this PRN's child) then we don't attempt to push
+        // predicates down.  There are two reasons for this: 1)
+        // we don't want to push the equijoin predicate that is
+        // required for the hash join, and 2) if we're doing a
+        // hash join then we're going to materialize the child node,
+        // but if we push predicates before materialization, we
+        // can end up with incorrect results (esp. missing rows).
+        // So don't push anything in this case.
+        boolean hashJoinWithThisPRN = hasTrulyTheBestAccessPath &&
+            (trulyTheBestAccessPath.getJoinStrategy() != null) &&
+            trulyTheBestAccessPath.getJoinStrategy().isHashJoin();

-        if ((restrictionList != null) && !alreadyPushed)
+        if ((restrictionList != null) && !alreadyPushed && !hashJoinWithThisPRN)
         {
             restrictionList.pushUsefulPredicates((Optimizable) childResult);
         }

Note that "hasTrulyTheBestAccessPath" is an object variable that will be true if the PRN has a child that is not an Optimizable--and a SelectNode subquery is just such a child.  Note also that this change does not affect existing hash joins because, prior to the changes described in this document, Derby only ever did hash joins with a PRN if it had a child that was an Optimizable, and in that case hasTrulyTheBestAccessPath will always be false (so hashJoinWithThisPRN will be false and we'll push predicates as usual).

Step three, then, is carried out by the following call:

        /*
        ** Replace this PRN with a HTN if a hash join
        ** is being done at this node.  (Hash join on a scan
        ** is a special case and is handled at the FBT.)
        */
        if (trulyTheBestAccessPath.getJoinStrategy() != null &&
            trulyTheBestAccessPath.getJoinStrategy().isHashJoin())
        {
            return replaceWithHashTableNode();
        }

The definition of replaceWithHashTableNode() in the current code is pretty well commented so I won't go into details about the code that's already there.  I did, however, have to add a small amount of logic to that method to account for hash joins with subqueries.  In particular, I added the following code to pass some information down from the ProjectRestrictNode to its child so that, when it comes time to find and generate the equijoin predicate for the hash join, the child result knows what access path and table number it's supposed to use.  Note that by the time we get here, the subquery node (ex. SelectNode) will now have been converted into a corresponding SingleChildResultSet node (ex. another ProjectRestrictNode or a DistinctNode).  This conversion happens as part of the call to "modify access paths" on the subquery child.  That said, I've now added the following logic to replaceWithHashTableNode():

     private Optimizable replaceWithHashTableNode()
         throws StandardException
     {
+        // If this PRN has TTB access path for its child, store that access
+        // path in the child here, so that we can find it later when it
+        // comes time to generate qualifiers for the hash predicates (we
+        // need the child's access path when generating qualifiers; if we
+        // don't pass the path down here, the child won't be able to find
+        // it).
+        if (hasTrulyTheBestAccessPath)
+        {
+            ((FromTable)childResult).trulyTheBestAccessPath =
+                (AccessPathImpl)getTrulyTheBestAccessPath();
+
+            // If the child itself is another SingleChildResultSetNode
+            // (which is also what a ProjectRestrictNode is), then tell
+            // it that it is now holding TTB path for it's own child.  Again,
+            // this info is needed so that child knows where to find the
+            // access path at generation time.
+            if (childResult instanceof SingleChildResultSetNode)
+            {
+                ((SingleChildResultSetNode)childResult)
+                    .hasTrulyTheBestAccessPath = hasTrulyTheBestAccessPath;
+
+                // While we're at it, add the PRN's table number to the
+                // child's referenced map so that we can find the equijoin
+                // predicate.  We have to do this because the predicate
+                // will be referencing the PRN's tableNumber, not the
+                // child's--and since we use the child as the target
+                // when searching for hash keys (as can be seen in
+                // HashJoinStrategy.divideUpPredicateLists()), the child
+                // should know what this PRN's table number is.  This
+                // is somewhat bizarre since the child doesn't
+                // actually "reference" this PRN, but since the child's
+                // reference map is used when searching for the equijoin
+                // predicate (see "buildTableNumList" in
+                // BinaryRelationalOperatorNode), this is the simplest
+                // way to pass this PRN's table number down...
+                childResult.getReferencedTableMap().set(tableNumber);
+            }
+        }

With these changes in place, we should now correctly generate the query plan for a hash join that has a subquery as its inner result set.  All that's left, then, is to address the issue of potentially unsafe hash joins.

B. Unsafe Hash Joins

Let's go back to the HashJoinStrategy.feasible() method that I mentioned briefly in sections IV and V above.  In those sections I described some simple changes that allow the call to "innerTable.isMaterializable()" to potentially return true even if innerTable represents a subquery
(which wasn't possible prior to DERBY-781).  The feasible() method then calls "findHashKeyColumns()" which, as described in section VI, will now search a subquery's subtree to find table numbers and thus the optimizer can find equijoin predicates that it otherwise would have missed.

But now we have another thing to consider: depending on the predicates in question, it's not always safe to do a hash join with a subquery--*even if* there exists one or more relevant equijoin predicates.  We can see a glimpse of the underlying issue here by looking at the FromTable.isMaterializable() method, where we see the following:

        /* Derived tables are materializable
         * iff they are not correlated with an outer query block.
         */
        HasCorrelatedCRsVisitor visitor = new HasCorrelatedCRsVisitor();
        accept(visitor);
        return (!(visitor.hasCorrelatedCRs()));

In other words, a "derived table" (which is what a subquery is) is not materializable--and therefore is not eligible for a hash join--if it contains column references that are correlated with an outer query.

When considering a hash join with a subquery--or more generally, with a table that is a FromTable having a subtree beneath it--we want to take this one step further and say that the FromTable is not materializable (and therefore is not eligible for a hash join) if it is correlated with an outer query.  This correlation between the FromTable and an outer query is accomplished via join predicates and is identifiable by inspection of the predicates' referenced maps.  More specifically, if either operand of the predicate directly references any tables which are in the subtree beneath the FromTable, then the operands' sources have different nesting levels and thus the predicate is the basis of a correlation between the FromTable and the outer query.   This means that the FromTable (and its corresponding subtree) is not materializable and therefore we cannot safely perform a hash join. 

Let's take an example.  Consider the following two queries:

Q1.
select * from V1, V2 where V1.j = V2.b;
Q2. select 1 from tt, (t left outer join (s left outer join r on (f = i)) on (b = e)) where (j=g);

The query tree representation for Q1 is shown below.  I've included tableNumbers in square brackets and, for the predicates, I've included the referencedMaps in curly brackets.  A -1 in brackets means that the specified node was not assigned a table number (-1 is the default); an empty bracket means that the specified node doesn't have a tableNumber field (because it's not an Optimizable).  Table names such as "T1" are actually instances of FromBaseTable, but I've just used the table names themselves.

Q1. select * from V1, V2 where V1.j = V2.b;

  SelectNode [] (V1.j = V2.b) {0, 3}
           |     
< PRN [0]   ,               PRN [3] >
     |                         |
SelectNode[]              SelectNode []
     |                         |
< PRN [-1] , PRN [-1] >   < PRN [-1] , PRN [-1] >
      |        |               |         |
   T1 [1]     T2 [2]         T3 [4]    T4 [5]


We can see here that the predicate (V1.j = V2.b) has {0,3} as it's referenced map.  Let "innerTable" be the subquery V2, which is represented by PRN[3].  If we walk the subtree beneath V2 we will see two other base table numbers: {4,5}.  Since the predicate's referenced map does not contain either of those base table numbers, we know that the subquery is not correlated, and thus it's safe to do a hash join with the subquery V2 (assuming we can find an equijoin predicate, as discussed in section VI).

Now lets look at the query tree representation for Q2:

Q2. select 1 from tt, (t left outer join (s left outer join r on (f = i)) on (b = e)) where (j=g);

  SelectNode [] (j = g) {0, 3}
           |     
< PRN [-1] , PRN [-1] >
  |            |
TT [0]   HalfOuterJoinNode [5]
               |
              / \ (b = e) {1,2}
       PRN [-1]  PRN [-1]
           |        |
         T [1]  JoinNode [4]
                    |
                   / \ (f = i) {2,3}
            PRN [-1]  PRN [-1]
                 |     |
               S [2]  R [3]

In this case the optimizer for the outer query is trying to figure out what the best join strategy between TT[0] and HalfOuterJoinNode[5] is.  The first thing to note here is that, as mentioned in section VI above, the logic we added for DERBY-781 must still work for existing hash join optimizations--that is, for cases where the inner table is a non-subquery FromTable (such as a HalfOuterJoinNode in this case).  That said, if we follow the three steps described in section VI for Q2, we'll come to the conclusion that a hash join with HalfOuterJoinNode[5] as the inner table is feasible because, by walking the subtree, we'll find table number "3" and deduce that (j = g) with reference map {0,3} is a valid equijoin.  That would, however, constitute an unsafe hash join because the predicate (j = g) creates a correlation between two nodes at different nesting levels--namely, TT[0] and R[3].  If we went ahead and performed the hash join, we would end up materializing HalfOuterJoinNode[5] and that, because of the correlation, would lead to incorrect results (esp. missing rows).

So how do we avoid this?  As mentioned above, we check the predicate's referenced map and see if either operand points directly to a node that is in the subtree beneath "innerTable".  In this case
the predicate's referenced map is {0,3} and innerTable is HalfOuterJoinNode[5].  If we walk the subtree beneath HalfOuterJoinNode[5], we will see three other base table numbers: {1,2,3}.  Then, since the intersection of {0,3} and {1,2,3} is not empty, we know that that predicate must be the basis of a correlation between two nodes at different nesting levels.  Therefore we have to explicitly disallow the hash join.

In terms of code, I check for this condition in HashJoinStrategy.feasible() and, if I find it, I disallow the hash join by returning "false"--i.e. by saying that the hash join is not feasible.  This check is performed before we attempt to find hash key columns.  The diff is as follows:

+        /* If the predicate given by the user _directly_ references
+         * any of the base tables _beneath_ this node, then we
+         * cannot safely use the predicate for a hash because the
+         * predicate correlates two nodes at different nesting levels.
+         * If we did a hash join in this case, materialization of
+         * innerTable could lead to incorrect results--and in particular,
+         * results that are missing rows.  We can check for this by
+         * looking at the predicates' reference maps, which are set based
+         * on the initial query (as part of pre-processing).  Note that
+         * by the time we get here, it's possible that a predicate's
+         * reference map holds table numbers that do not agree with the
+         * table numbers of the column references used by the predicate.
+         * That's okay--this occurs as a result of "remapping" predicates
+         * that have been pushed down the query tree.  And in fact
+         * it's a good thing because, by looking at the column reference's
+         * own table numbers instead of the predicate's referenced map,
+         * we are more readily able to find equijoin predicates that
+         * we otherwise would not have found.
+         *
+         * Note: do not perform this check if innerTable is a FromBaseTable
+         * because a base table does not have a "subtree" to speak of.
+         */
+        if ((predList != null) && (predList.size() > 0) &&
+            !(innerTable instanceof FromBaseTable))
+        {
+            FromTable ft = (FromTable)innerTable;
+
+            // First get a list of all of the base tables in the subtree
+            // below innerTable.
+            JBitSet tNums = new JBitSet(ft.getReferencedTableMap().size());
+            BaseTableNumbersVisitor btnVis = new BaseTableNumbersVisitor(tNums);
+            ft.accept(btnVis);
+
+            // Now get a list of all table numbers referenced by the
+            // join predicates that we'll be searching.
+            JBitSet pNums = new JBitSet(tNums.size());
+            Predicate pred = null;
+            for (int i = 0; i < predList.size(); i++)
+            {
+                pred = (Predicate)predList.getOptPredicate(i);
+                if (pred.isJoinPredicate())
+                    pNums.or(pred.getReferencedSet());
+            }
+
+            // If tNums and pNums have anything in common, then at
+            // least one predicate in the list refers directly to
+            // a base table beneath this node (as opposed to referring
+            // just to this node), which means it's not safe to do a
+            // hash join.
+            tNums.and(pNums);
+            if (tNums.getFirstSetBit() != -1)
+                return false;
+        }

If we go through the predicates and determine that they are all "safe" for materialization, then we will go on to search for hash key columns as described in section VI above.

C. Optimizer Change.

When running tests I noticed that, occasionally, the optimizer would choose to do a hash join with a subquery, but the join order for the subquery would be worse than other times.  At first I just assumed this was because of optimizer timeout--that for whatever reason (esp. other apps running my computer) the optimizer didn't get through as many join orders as "usual" before timeout and so didn't have a chance to find the better join order.  And in a way that's true.  However, I then noticed that the times when the optimizer chose the worse join order for the subquery, it was actually timing out _later_ than when it chose the better plan--i.e. the optimizer was seeing more join orders but was picking a worse plan.  Confused, I did some tracing and eventually tracked the odd behavior down to the following code in OptimizerImpl.getNextPermutation():

        if (timeExceeded && bestCost.isUninitialized())
        {
            /* We can get here if this OptimizerImpl is for a subquery
             * that timed out for a previous permutation of the outer
             * query, but then the outer query itself did _not_ timeout.
             ...
             * All of that said, instead of just trying the first possible
             * join order, we jump to the join order that gave us the best
             * cost in previous rounds. 
             */
             <code to jump to the previous best join order and get the cost>
        }

        <else, go through join orders as usual>

As described in the comment for the "if" block, in situations where the subquery has "timed out" before the current round of optimization has even started, we'll take an "educated guess" at what the best join order is by jumping to the join order that gave us the best cost in the previous round.  The interesting part, though, is that we will _not_ make this educated guess if we haven't timed out yet--and that's why I observed the "odd" behavior mentioned above.

To take a simplified and contrived example, assume we have the following query:

  select v1.i, t3.b
    from
      t3,
      (select distinct t1.i, t3.j from t1, t3, t5, t6) v1
    where t3.a = v1.i;

The two complete join orders that the outer optimizer will try are {t3, v1} and {v1, t3}.  This in turn means that there will be four rounds of optimization for the subquery: two for {t3, v1} and two for {v1, t3}, where each join order will optimize once for nested loop join and once for hash join.

Scenario 1:

Let's assume that the optimizer is evaluating the cost of {t3, v1}, and that the best join order for the first round--which will be for a nested loop join with v1 as the inner table--is {t1, t6, t5, t3} and the cost is 100 (the number doesn't mean anything, I'm just using it for illustration purposes).  Let's further assume that the subquery timed out during this first round of optimization, but the outer query does *not* timeout.  Then the outer query will make a call to optimize the subquery a second time, this time trying a hash join with v1 as the inner table.  At this point the above logic will kick in because the subquery already timed out, so the optimizer for the subquery will jump to join order {t1, t6, t5, t3} and evaluate the cost based on the current join strategy (hash join)--let's say the cost is 50 (i.e. the hash join with v1 is cheaper than the nested loop join).  At this point the subquery optimizer has a best join order and an associated cost (50) and so it will stop optimizing due to timeout, returning a cost of 50 to the outer query.  The outer query then times out and so uses the cost of 50 (which corresponds to a hash join with v1 as the inner table, join order {t1, t6, t5, t3}) as its "best path" and generates the hash.

Okay, so that's how we get the better plan.  But how is it that sometimes, by seeing _more_ join orders, the optimizer for the subquery can actually return a worse plan?  We'll walk through the situation again, but this time we won't timeout so soon:

Scenario 2:

Let's assume that the optimizer is evaluating the cost of {t3, v1}, and that the best join order for the first round--which will be for a nested loop join with v1 as the inner table--is {t1, t6, t5, t3} and the cost is 100.  But unlike in the previous scenario, the optimizer for the subquery does *not* timeout at this point.  Then the outer query will make a call to optimize the subquery a second time, this time using a hash join with v1 as the inner table.  At this point, since the subquery hasn't timed out yet, we'll follow the "else" branch in the above code, meaning that we'll just do a regular search for the best join order, starting at square one.  So let's say that the optimizer for the subquery tries {t1, t3, t5, t6} and gets a cost of 75, and _then_ it times out.  This means that the subquery optimizer saw one more join order before timing out.  Now, since the the subquery optimizer has a best join order and an associated cost (75), it will stop optimizing due to timeout, returning a cost of 75 to the outer query.  The outer query then compares 75 to 100 and since 75 is cheaper, it chooses to do a hash join with v1 as the inner table and join order {t1, t3, t5, t6}.

Thus, because the optimizer for the subquery saw one more join order in the second scenario than in the first, it ended up choosing a worse plan: cost 75 instead of 50.

To resolve this issue, I didn't have to "fix" the existing if-block outlined above--the code as it is correct.  Instead, I simply made it so that the optimizer for a subquery will make the "educated guess"--i.e. will jump to the previous best join order--even if that optimizer has not timed out yet, based on certain criteria.  That "criteria" is simply this:  if neither the previous round of optimization nor this round relies on predicates that have been pushed down from an outer query, then make an "educated guess" as to what the best join order for this round will be by jumping to the best join order found in the previous round (if there was one):

-        if (timeExceeded && bestCost.isUninitialized())
+        if (bestCost.isUninitialized() && foundABestPlan &&
+            ((!usingPredsPushedFromAbove && !bestJoinOrderUsedPredsFromAbove)
+                || timeExceeded))

The reason I've code this criteria into the if statement is that a subquery which has predicates pushed from an outer query could have a vastly different "best join order" than a subquery that doesn't have such predicates. On the other hand, if no predicates from outer queries are involved then the subquery's best join order is largely independent of the outer query, which means that the best join order is likely to be same from one round to the next, and thus it makes more sense to take an "educated guess" by starting with the best join order from a previous round.

With this change in place (along with the appropriate declaration and assignment of the new variables, of course), the optimizer for the subquery in scenario 2 will jump to join order {t1, t6, t5, t3} for the second round of optimization (the round where we're considering a hash join) and then calculate the cost, which will return 50 (as in scenario 1). The subquery optimizer might then go on to try other join orders before it times out, and it may even find a better plan--but at least we know that we'll generate a plan at least as good as the one we found in scenario 1, which is what we were shooting for.

And with that slight change to the OptimizerImpl class, I think I've described all of the major pieces of the patch that I've created to resolve for DERBY-781.
Whew.

VIII) Further Work.

The changes as described in this document (version 1) will allow the optimizer to consider hash joins with subqueries and to generate the corresponding query plan if that's the cheapest plan.  There is, however, one necessary drawback to these changes: since the optimizer is now considering additional joins where it didn't used to, compilation for complex subqueries could now take more per-join-order time.  This means two things: 1) for queries with very large and/or complex subqueries, the overall compilation time could increase in a sometimes significant way; and 2) because it can take longer for the optimizer to find the specific "best path" for a given join order (assuming one of the Optimizables in the join order is a large/complex subquery), it's theoretically possible that the optimizer may timeout before reaching join orders that it used to reach, leading to potentially worse execution plans.

For example, assume we have some query

  select <...> from t1, v1 where v1.xxx = t1.yyy ...

where v1 is a view that consitutes a large, non-flattenable subquery taking 100 units of time per "decorated" permutation to optimize, while t1 takes 1 unit of time per "decorated" permutation to optimize.  Assume there's an index on T1.  Prior to the changes for DERBY-781, the optimization may have gone something like this:
With the changes as I've described them in this document, the optimization would then be:
So the total time it takes to optimize the query in this case jumps from 206 units to 306 units--i.e. almost 50%.  If the subquery is fairly simple, this additional compilation time will be fairly negligible; but if it's a large and/or complicated subquery, the time to compile can grow significantly.

That said, it's worth noting that the testing I've done with some larger databases shows that the hit in compilation time typically leads to a much better execution plan and thus *much* less time spent in query execution, making for an overall query time that is still less than what Derby was seeing prior to the DERBY-781 changes.  If we combine that with the fact that a query is (usually) compiled once and then executed repeatedly thereafter, then it seems like these DERBY-781 changes are still a very good thing for customer applications.

Nonetheless, it would be a good future task to investigate how it might be possible to reduce the overall optimization time for subqueries--and in particular, some kind of optimizer "pruning" algorithm might be nice here.  I do not, however, attempt to resolve that problem as part of the DERBY-781 changes.

Further, let's assume that the "best path" for this query is join order {v1, t1} with hash join (using t1 as the inner table) and an estimated cost of 150 units.  We'll also say that {t1, v1} with nested loop join has a cost of 400 units and {t1, v1} with hash join (using subquery v1 as the inner table) has a cost of 200 units.

Prior to DERBY-781 the optimizer would have spent 102 units trying {t1, v1}, and then since the estimated cost (400) is greater than time spent so far (102) the optimizer would have continued on to consider {v1, t1}.  The total optimize time would then be 206 and the best path would have been (v1, t1) with a hash join, giving the estimated cost of 150 units.

With the DERBY-781 changes, the optimizer will spend 202 units trying (t1, v1) and would have come back with an estimated cost of 200 (doing a hash with v1).  Then, since the estimate cost (200) is less than the time spent so far (202) the optimizer will "timeout" and pick {t1, v1) with hash join as the best path.  In other words, the fact that optimizer must spend time considering the hash join means that it can end up timing out before it sees all of the join orders that it used to see, and therefore we can end up with a sub-optimal plan (in this case, a plan with cost 200 instead of a plan with cost 150).

So another future task is to see how we can allow the optimizer to consider hash joins without hitting this "timeout" behavior.

All of that said, I looked at DERBY-1357 (optimizer short-circuiting) and it looks as though the fix for that issue could go a long ways toward negating the extra compilation time incurred by DERBY-781.  This will be especially true for queries with complex subqueries that are part of FROM lists with several FromTables because proper short-circuit logic can allow the optimizer to skip useless optimization of the subqueries in some join orders, which can save a lot of time.  I have a fix for DERBY-1357 in my queue and plan to post that in the near future, as well.

And of course, I ran derbyall
with the changes as described in this document and the suite doesn't take any longer now than it does prior to my changes, which seems to confirm that the extra compilation time is only an issue when we have large subqueries that take a lot of time to optimize (of which derbyall apparently has very few).  And in cases where we *do* have large subqueries that take a lot of time to optimize, it seems likely that most of those queries will benefit more from subquery materialization than they will suffer from the one-time optimization hit.  Extra compilation time is not ideal, but it's also not a surprise since the DERBY-781 changes cause the optimizer to do "more stuff", which is naturally going to require more time.  From a customer standpoint I think the tradeoff is usually going to be well worth it--but future work might include figuring out how to prune out plans more aggressively to save overall optimization time.  That is, I think, another issue to be handled on another day.



Army, 06/30/2006