|
My trunk client is at revision 603368
$ svn info Path: . URL: https://svn.apache.org/repos/asf/db/derby/code/trunk Repository UUID: 13f79535-47bb-0310-9956-ffa450edef68 Revision: 603368 Node Kind: directory Schedule: normal Last Changed Author: kmarsden Last Changed Rev: 603301 Last Changed Date: 2007-12-11 09:03:01 -0800 (Tue, 11 Dec 2007) Properties Last Updated: 2007-11-26 09:22:05 -0800 (Mon, 26 Nov 2007) When I run the attached test case with revision 603368, I get incorrect results, ie only one count is returned back. $ java org.apache.derbyTesting.functionTests.tests.lang.DERBY_3257_Repro Database product: Apache Derby Database version: 10.4.0.0 alpha - (1) Driver name: Apache Derby Embedded JDBC Driver Driver version: 10.4.0.0 alpha - (1) result: 2 In
>Mike, I'll take a look at Manish, I was wondering if you have been able to make any progress on this issue. I'd be willing to help in any way I can, adding tests, committing the change etc. I am not sure exactly how much else I can offer. I thought I would try to educate myself by reading the notes attached to Attaching the query plans for 10.2 vs 10.4
Interestingly if I remove AND t0.CODE IS NOT NULL (there are no null codes to eliminate), I get the right result, two rows.
PreparedStatement pstmt = conn.prepareStatement("SELECT t0.CODE, COUNT(t0.ID) FROM CTS1.TEST_TABLE t0 GROUP BY t0.CODE HAVING (t0.CODE = ? OR t0.CODE = ?)"); I noticed that if the ordering of the predicates inside the HAVING clause is reversed, the query returns the "other" results, i.e:
-- Returns "2" SELECT COUNT(t0.ID) FROM TEST_TABLE t0 GROUP BY t0.CODE HAVING (t0.CODE = 'GBR' OR t0.CODE = 'CHA') AND t0.CODE IS NOT NULL; -- Returns "4" SELECT COUNT(t0.ID) FROM TEST_TABLE t0 GROUP BY t0.CODE HAVING (t0.CODE = 'CHA' OR t0.CODE = 'GBR') AND t0.CODE IS NOT NULL; With a little tracing it turns out that the HAVING clause is somehow getting messed up during preprocessing. More specifically, see line 883 in SelectNode.preprocess(): if (havingClause != null) { havingClause = havingClause.preprocess( numTables, fromList, havingSubquerys, wherePredicates); } Before the call to "havingClause.preprocess(...)" the HAVING clause is correct: havingClause ==> (OrNode AND NotNullNode) where: -- OrNode ==> (BinaryRelationalOperatorNode_0 OR BinaryRelationalOperatorNode_1) -- BinaryRelationalOperatorNode_0 ==> (t0.CODE = 'GBR') -- BinaryRelationalOperatorNode_1 ==> (t0.CODE = 'CHA') But after the call to that method, the OrNode has been replaced with its left operand: havingClause ==> (BinaryRelationalOperatorNode_0 AND NotNullNode) So we lose half of the OR condition, and that's why we end up with missing rows. A good start for this issue would be to set a break point inside the above "if" statement and then trace through the preprocessing logic to see when/how/why the OrNode gets replaced with its left operand... I started worrying that the underlying cause of this might be
It turns out that the OrNode preprocess() method expects everything to be normalized to Conjunctive Normal Form--and we do in fact normalize the WHERE as part of SelectNode.normExpressions(). But we do *NOT* currently normalize the HAVING clause, so when the OrNode in the HAVING clause was preprocessed, we were doing the wrong thing. I made a very quick change to normalize the HAVING clause, as well, and the queries posted to this issue now return correct results (2 rows). I have not run any other tests on it, though, so this might not be a valid fix. But it does indicate what the problem is to some degree... Why this worked before I'm calling my quick change d3257_doNOTCommit.patch because I haven't tested it at all (except for the queries in this issue). It should not be considered a viable approach until someone does more verification... Thanks Army for looking at this issue. I ran lang._Suite with the patch and see an interesting symptom that some queries are now throwing a 42X24 exception. e.g. in aggregate.sql
select c1 from t1 group by c1 having max(c2) in (select c1 from t2); ERROR 42X24: Column C1 is referenced in the HAVING clause but is not in the GROUP BY list. Thanks for the jump start though. I will investigate what is going on with this error with the patch. Here is the script to reproduce the 42X24 error message with the patch.
ij> run '42X24_error.sql'; ij> connect 'jdbc:derby:wombat;create=true'; WARNING 01J01: Database 'wombat' not created, connection made to existing database instead. ij> drop table t1; 0 rows inserted/updated/deleted ij> drop table t2; 0 rows inserted/updated/deleted ij> create table t1 (c1 int, c2 int); 0 rows inserted/updated/deleted ij> create table t2 (c1 int, c2 int); 0 rows inserted/updated/deleted ij> create table oneRow (c1 int, c2 int); ERROR X0Y32: Table/View 'ONEROW' already exists in Schema 'APP'. java.sql.SQLException: Table/View 'ONEROW' already exists in Schema 'APP'. at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(SQLExceptionFactory.java:45) at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Util.java:202) at org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(TransactionResourceImpl.java:391) at org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(TransactionResourceImpl.java:346) at org.apache.derby.impl.jdbc.EmbedConnection.handleException(EmbedConnection.java:1666) at org.apache.derby.impl.jdbc.ConnectionChild.handleException(ConnectionChild.java:81) at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(EmbedStatement.java:1324) at org.apache.derby.impl.jdbc.EmbedStatement.execute(EmbedStatement.java:624) at org.apache.derby.impl.jdbc.EmbedStatement.execute(EmbedStatement.java:556) at org.apache.derby.impl.tools.ij.ij.executeImmediate(ij.java:330) at org.apache.derby.impl.tools.ij.utilMain.doCatch(utilMain.java:508) at org.apache.derby.impl.tools.ij.utilMain.runScriptGuts(utilMain.java:350) at org.apache.derby.impl.tools.ij.utilMain.go(utilMain.java:248) at org.apache.derby.impl.tools.ij.Main.go(Main.java:215) at org.apache.derby.impl.tools.ij.Main.mainCore(Main.java:181) at org.apache.derby.impl.tools.ij.Main.main(Main.java:73) at org.apache.derby.tools.ij.main(ij.java:59) Caused by: ERROR X0Y32: Table/View 'ONEROW' already exists in Schema 'APP'. at org.apache.derby.iapi.error.StandardException.newException(StandardException.java:371) at org.apache.derby.impl.sql.catalog.DataDictionaryImpl.duplicateDescriptorException(DataDictionaryImpl.java:167 3) at org.apache.derby.impl.sql.catalog.DataDictionaryImpl.addDescriptor(DataDictionaryImpl.java:1664) at org.apache.derby.impl.sql.catalog.DataDictionaryImpl.addDescriptor(DataDictionaryImpl.java:1643) at org.apache.derby.impl.sql.execute.CreateTableConstantAction.executeConstantAction(CreateTableConstantAction.j ava:238) at org.apache.derby.impl.sql.execute.MiscResultSet.open(MiscResultSet.java:64) at org.apache.derby.impl.sql.GenericPreparedStatement.execute(GenericPreparedStatement.java:370) at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(EmbedStatement.java:1234) ... 10 more ij> insert into oneRow values(1,1); 1 row inserted/updated/deleted ij> insert into t1 values (null, null), (1,1), (null, null), (2,1), (3,1), (10,10); 6 rows inserted/updated/deleted ij> insert into t2 values (null, null), (1,1), (null, null), (2,1), (3,1), (10,10); 6 rows inserted/updated/deleted ij> select c1 from t1 group by c1 having max(c2) in (select c1 from t2); ERROR 42X24: Column C1 is referenced in the HAVING clause but is not in the GROUP BY list. java.sql.SQLException: Column C1 is referenced in the HAVING clause but is not in the GROUP BY list. at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(SQLExceptionFactory.java:45) at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Util.java:202) at org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(TransactionResourceImpl.java:391) at org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(TransactionResourceImpl.java:346) at org.apache.derby.impl.jdbc.EmbedConnection.handleException(EmbedConnection.java:1666) at org.apache.derby.impl.jdbc.ConnectionChild.handleException(ConnectionChild.java:81) at org.apache.derby.impl.jdbc.EmbedStatement.execute(EmbedStatement.java:613) at org.apache.derby.impl.jdbc.EmbedStatement.execute(EmbedStatement.java:556) at org.apache.derby.impl.tools.ij.ij.executeImmediate(ij.java:330) at org.apache.derby.impl.tools.ij.utilMain.doCatch(utilMain.java:508) at org.apache.derby.impl.tools.ij.utilMain.runScriptGuts(utilMain.java:350) at org.apache.derby.impl.tools.ij.utilMain.go(utilMain.java:248) at org.apache.derby.impl.tools.ij.Main.go(Main.java:215) at org.apache.derby.impl.tools.ij.Main.mainCore(Main.java:181) at org.apache.derby.impl.tools.ij.Main.main(Main.java:73) at org.apache.derby.tools.ij.main(ij.java:59) Caused by: ERROR 42X24: Column C1 is referenced in the HAVING clause but is not in the GROUP BY list. at org.apache.derby.iapi.error.StandardException.newException(StandardException.java:290) at org.apache.derby.impl.sql.compile.GroupByNode.addNewColumnsForAggregation(GroupByNode.java:529) at org.apache.derby.impl.sql.compile.GroupByNode.addAggregates(GroupByNode.java:237) at org.apache.derby.impl.sql.compile.GroupByNode.init(GroupByNode.java:181) at org.apache.derby.iapi.sql.compile.NodeFactory.getNode(NodeFactory.java:273) at org.apache.derby.impl.sql.compile.SelectNode.genProjectRestrict(SelectNode.java:1242) at org.apache.derby.impl.sql.compile.SelectNode.modifyAccessPaths(SelectNode.java:1816) at org.apache.derby.impl.sql.compile.DMLStatementNode.optimizeStatement(DMLStatementNode.java:307) at org.apache.derby.impl.sql.compile.CursorNode.optimizeStatement(CursorNode.java:515) at org.apache.derby.impl.sql.GenericStatement.prepMinion(GenericStatement.java:365) at org.apache.derby.impl.sql.GenericStatement.prepare(GenericStatement.java:88) at org.apache.derby.impl.sql.conn.GenericLanguageConnectionContext.prepareInternalStatement(GenericLanguageConne ctionContext.java:756) at org.apache.derby.impl.jdbc.EmbedStatement.execute(EmbedStatement.java:607) ... 9 more ij> I notice that if I avoid normExpressions for SubqueryNodes, things seem to work ok for both the test case and 42X24_error.sql. i.e.
if (! (havingClause instanceof SubqueryNode)) havingClause = normExpressions(havingClause); I am not sure that this is the right solution. I think I need a better understanding of why it is necessary for the havingClause to be in CNF before the call to preprocess and is it a problem to avoid that for SubqueryNodes. I am running suites.All to see if there are any issues but would appreciate input on whether I am on the right track here. Thanks Kathey TestHaving cannot be contributed to Apache. We will need to make another test. Reattaching without grant license.
> I think I need a better understanding of why it is necessary for the havingClause
> to be in CNF before the call to preprocess This is a good question--and perhaps in the long run this is not requirement per se. The only reason I thought to normalize the havingClause is because the OrNode.preprocess() method expects that the OrNode has been normalized. I.e. it expects that the predicate: [not normalized] => (t0.CODE = 'GBR' OR t0.CODE = 'CHA') becomes [normalized] => (t0.CODE = 'GBR' OR ((t0.CODE = 'CHA') OR FALSE)) When it transforms the OR list into an IN list, then, it just walks the tree and grabs the left operand from all of the chained ORs. So in the normalized case we grab: 1. t0.CODE = 'GBR' 2. t0.CODE = 'CHA' From those two predicates we then get "t0.CODE in ('GBR', 'CHA')", which is correct. If the HAVING clause is not normalized, though, then the logic in OrNode will only grab: 1. t0.CODE = 'GBR' because there is only one OrNode and that's its left operand. So we end up with "t0.CODE in ('GBR')" and thus we miss the row for 'CHA'. All of that said, maybe normalizing the HAVING clause is not the best solution, or perhaps it goes too far? Maybe OrNode should be changed to recognize if it is NON-normalized and should not try to transform itself in that case? If that's true, the follow-up question would be: Are there other places in the code that expect the having clause to be normalized, as well? It seems to me that something in > is it a problem to avoid [normalization] for SubqueryNodes? I would follow this up with two other questions: 1. What it is about normalization that causes problems for the SubqueryNode? It looks like the 42X24 error comes because the result column referenced within the subquery has a "source level" that is the same as the outer query--and that is (apparently?) because the SuqbueryNode somehow "disappears" as a result of the normalization. That suggests that perhaps the Subquery is being flattened during, or as a result of, normalization. Without normalization, the SubqueryNode hangs around and its result column gets a source level that is different from the outer query, so the check in GroupByNode.addNewColumnsForAggregation() passes. 2. What happens if the same nested query appears in a WHERE clause and the WHERE clause is normalized? Will the subquery be flattened during or after normalization? These two questions don't answer yours, but perhaps some investigation of them will lead to an answer. If the answer is "Yes, it's okay to avoid normalization for SubqueryNodes", then that leads to yet another question: are there other nodes for which it is okay to avoid normalization, as well? Just noticed that the code which is failing with 42X24 was added as part of
<begin> Basically I want to ensure that there are no column references in the havingClause as the same nesting level as this query. I think the comments with the code are adequate explanation. A new SQLCode is used for this error. Since error checking depends on nestingLevel I had to fix up code to make sure that it was set correctly in various places. <end> Not sure if this is relevant or not, but I thought I'd mention it since I was thinking the error was in code that had been around since before Army asked;
>1. What it is about normalization that causes problems for the SubqueryNode? In SubqueryNode.preProcess() there is this code flattenable = (resultSet instanceof SelectNode) && underTopAndNode && (isIN() || isANY() || isEXISTS() || flattenableNotExists || parentComparisonOperator != null); Without normalization underTopAndNode is false and so we don't flatten. Attached is a patch to fix this issue. derby-3257_diff.txt. It has Army's changes plus code to mark the subqueries that are in the having clause as such so that we can avoid flattenning during preprocess. We clearly are not setup to handle subquery flattenning within the having clause and just happenned to avoid it before because the having clause was not normalized. Adding the normalization meant we needed another mechanism to flag these subqueries to avoid flattenning. Perhaps a better long term strategy would be to allow for flattenning of subqueries in the having clause, but I am guessing that is a fairly significant endeavor.
Thank you for investigating this further, and for posting a complete patch, Kathey. I verified that the new test case passes with the changes and fails without them.
> We clearly are not setup to handle subquery flattenning within the having clause This is good to know. Just curious: is there something in the code that makes this limitation explicit, or does this statement just follow from the fact that attempts to flatten such a subquery lead to other errors? Very minor nits on the patch that you can ignore if you choose: - Might be good to add another bullet to the list in the comment preceding the following line, so that the comments and the code match each other. @@ -609,7 +615,7 @@ - underTopAndNode && + underTopAndNode && !havingSubquery && Likewise for the other, similar change further down in that method. - Some whitespace inconsistencies between new code and existing code. - Maybe the new test case in GroupByTest would be better called "testOrNodeInHavingClause()" instead of "testDerby3257?" This doesn't affect the test at all, and the comments for the fixture clearly indicate what is being tested, so this is definitely "nit-picking". But a meaningful method name seems (to me, as reader of the test) slightly more convenient than the Derby Jira number. But these nits aside, if you've run derbyall and suites.All with this change and there were no failures, I think the patch can be committed... Thanks Army for looking at the patch.
Attached is a revised patch derby-3257_diff2.txt with the changes you recommended. I will commit this afternoon if I don't hear anything back. You asked >Is there something in the code that makes this limitation explicit, or does this statement just follow from the fact that attempts to flatten such a subquery lead to other errors? Flattenning produces a column in the having clause that is not generated to replace an aggregate causing this condition to throw the exception: if (!cr.getGeneratedToReplaceAggregate() && cr.getSourceLevel() == level) { throw StandardException.newException( SQLState.LANG_INVALID_COL_HAVING_CLAUSE, cr.getSQLColumnName()); } I tried to understand why this condition was necessary by commenting it out and I ended up with a null pointer exception in the generated code when I had a select in the having clause. I didn't pursue it much further than that and figured Manish put that condition and error there for good reason. Perhaps there is a way to allow the flattenning of the subquery but I was not able to figure out how to do it so went with this solution. I'm hoping at some point. Manish or someone else can take a look at this and come up with a more elegant solution allowing the subqueries to be flattenned in the having clause. Fwiw: I applied the first version of your patch and verified that the repro worked correctly. I also ran all the tests with the patch. I saw no failures. +1 to commit. And thanks for addressing this - one less regression to worry about :)
Looks like there were no issues with the nightlies due to this change. I will check into 10.3 as soon as my tests complete.
Merged fix to 10.3 with revision 614304
Catching up on Closing my reported issues. Thanks to Dyre for the workflow reminder today.
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
DERBY-681), but returns incorrect results after. So it appears thatDERBY-681is at least partially at play here. Early (incremental) changes forDERBY-47might also be a factor, but I haven't confirmed one way or the other.