Issue Details (XML | Word | Printable)

Key: DERBY-2085
Type: Bug Bug
Status: Resolved Resolved
Resolution: Fixed
Priority: Trivial Trivial
Assignee: Bryan Pendleton
Reporter: Øystein Grøvlen
Votes: 0
Watchers: 0
Operations

If you were logged in you would be able to see more operations.
Derby

Misleading error message for non-matching ORDER BY clause in queries with GROUP BY.

Created: 15/Nov/06 09:42 AM   Updated: 30/Jun/09 03:55 PM
Return to search
Component/s: SQL
Affects Version/s: 10.2.1.6
Fix Version/s: 10.5.1.1

Time Tracking:
Not Specified

File Attachments:
  Size
File Licensed for inclusion in ASF works newMsgWithTest.diff 2008-04-26 06:34 PM Bryan Pendleton 8 kB
HTML File Licensed for inclusion in ASF works releaseNote.html 2009-04-01 09:29 AM Knut Anders Hatlen 2 kB
HTML File Licensed for inclusion in ASF works releaseNote.html 2009-03-31 05:15 PM Bryan Pendleton 2 kB
Environment: Any
Issue Links:
Reference
 

Issue & fix info: Release Note Needed
Bug behavior facts: Regression
Resolution Date: 08/May/08 04:03 AM


 Description  « Hide
In 10.2, this query gives the following error message:

ij> SELECT i FROM t GROUP BY i ORDER BY j;
ERROR 42Y30: The SELECT list of a grouped query contains at least one invalid expression. If a SELECT list has a GROUP BY, the list may only contain valid grouping expressions and valid aggregate expressions.

This is misleading since there is no invalid expression in the SELECT
list. It is the ORDER BY clause that is wrong.

I have marked this as an regression since the error message in 10.1 is
more helpful (but still not correct):

ij> SELECT i FROM t GROUP BY i ORDER BY j;
ERROR 42Y36: Column reference 'J' is invalid. For a SELECT list with a GROUP BY, the list may only contain grouping columns and valid aggregate expressions.


 All   Comments   Work Log   Change History   Subversion Commits      Sort Order: Ascending order - Click to sort in descending order
Dyre Tjeldvoll added a comment - 07/Mar/08 04:04 PM
I believe the error message in question was modified in

------------------------------------------------------------------------
r437070 | djd | 2006-08-26 05:55:39 +0200 (Sat, 26 Aug 2006) | 2 lines

DERBY-883 Enhance GROUP BY clause to support expressions instead of just column references.
Patch contributed by Manish Khettry - manish_khettry@yahoo.com
------------------------------------------------------------------------

but I could not verify it because I'm no longer able to build revision 437069 with a modern jdk6.

Bryan Pendleton added a comment - 07/Mar/08 04:39 PM
Linking this to DERBY-2351, which is also concerned with
which error messages to issue for GROUP BY and ORDER BY problems.

Bryan Pendleton added a comment - 26/Apr/08 03:25 PM
I actually think the 10.1 message (42Y36) is pretty good. It identifies that the column
that is causing the problem is 'J', and it provides the rules for what column references
are legal in a grouped select. It's a bit vague about what "the list" is, but otherwise
I think it's a pretty good message.

So it should be just a matter of issuing this message in the case in question,
with the appropriate column name, and then updating any affected tests (as well
as adding a few new ones).

I'll see if I can work up a patch.

If anyone wishes to suggest an improved wording for 42Y36, please do, and I'd
be glad to include that in the patch as well.

Here's one possibility; is it any better?

ERROR 42Y36: Column reference 'J' is invalid. For a SELECT list with a GROUP BY, the column references may only contain grouping columns and valid aggregate expressions.

Bryan Pendleton added a comment - 26/Apr/08 06:31 PM
In a query such as this one, which of the two error messages (42Y30, 42Y36)
is superior?

  select c1+1, count(*) from test group by c1+2

Would we rather see:

  ERROR 42Y30: The SELECT list of a grouped query contains at least one invalid
  expression. If a SELECT list has a GROUP BY, the list may only contain valid
  grouping expressions and valid aggregate expressions.

or

  ERROR 42Y36: Column reference 'C1' is invalid. For a SELECT list with a
  GROUP BY,the list may only contain grouping columns and valid aggregate expressions

I think either message is ok, I don't feel a strong preference one way or
the other. However, the same code path is currently taken for this case
as is taken for the repro case:

  SELECT i FROM t GROUP BY i ORDER BY j

So if we feel that we need to get 42Y30 in the one case, but 42Y36 in the
other, we'll need to figure out how to get VerifyAggregateExpressionsVisitor
to be able to distinguish the one case from the other.

Bryan Pendleton added a comment - 26/Apr/08 06:34 PM
Attached is newMsgWithTest.diff, a patch proposal.

The patch changes VerifyAggregateExpressionsVisitor
to issue 42Y36 instead of 42Y30, in the case where
an invalid column reference is found in a grouped query.

The patch also adds some new error case tests to
GroupByTest, and changes some of the existing
error tests in GroupByTest and GroupByExpressionTest
to reflect that certain error cases now receive 42Y36
rather than 42Y30.

As I said in the previous comment on this issue, it is
not totally clear which message is superior in some of
these cases, so I'd like reviewers to examine the cases
where the message changed and provide some feedback
about which message they prefer.

Bryan Pendleton added a comment - 05/May/08 02:06 PM
Any opinions on this patch? All the tests pass, but since it changes the
error message that is returned, it would be nice to have some opinions
about which error message is preferable. I think the patch is a (small) improvement.

If no-one has any particular opinion, I'll commit this patch later this week.

Bryan Pendleton added a comment - 07/May/08 04:21 AM
Committed the patch to the trunk as revision 653988.

This patch doesn't seem important enough to me to merge
back to 10.4 or 10.3, so for now I'm only intending to commit
it to the trunk.

If, after we get more experience with the patch in the trunk,
people feel that we ought to merge it back to one or more
branches, it should be straightforward to do so.

Dag H. Wanvik added a comment - 07/May/08 04:10 PM
> select c1+1, count(*) from test group by c1+2
>
> Would we rather see:
>
> ERROR 42Y30: The SELECT list of a grouped query contains at least one invalid
> expression. If a SELECT list has a GROUP BY, the list may only contain valid
> grouping expressions and valid aggregate expressions.
>
> or
>
> ERROR 42Y36: Column reference 'C1' is invalid. For a SELECT list with a
> GROUP BY,the list may only contain grouping columns and valid aggregate expressions

I think the former message is better in this case, this it is the
expression "c1+1" that is invalid, the grouping expression being
"c1+2". The column c1 in itself is not the problem, which the message
indicates.

Tweaking the wording of 42Y36 may help, how about:

ERROR 42Y36: Column reference 'C1' is part of an expression that is
invalid. For a SELECT list with a GROUP BY, the list may only contain
grouping expressions and valid aggregate expressions.

Bryan Pendleton added a comment - 07/May/08 11:12 PM
Thanks Dag, I think that re-wording is an improvement. I'll have a look at making that change.

Bryan Pendleton added a comment - 07/May/08 11:13 PM
Reopen to investigate message re-wording.

Bryan Pendleton added a comment - 08/May/08 04:03 AM
Committed revised text for 42Y36 as revision 654404. The message
now is more broadly worded to make it clear that the column in
question may be either a simple column reference or may be part
of a larger expression.

Suran Jayathilaka added a comment - 30/Mar/09 05:53 PM
When runnig the test
org.apache.derbyTesting.functionTests.tests.lang.GroupByTest
in soft upgrade mode with 10.4.2.0 -> 10.5.1.0 (RC1) I came across these failures. The failures occur when the the test is run alone, as well as as part of suites.All.

35) testGroupByErrors(org.apache.derbyTesting.functionTests.tests.lang.GroupByTest)junit.framework.ComparisonFailure: Unexpected SQL state. expected:<42Y3[0]> but was:<42Y3[6]>
at org.apache.derbyTesting.junit.BaseJDBCTestCase.assertSQLState(BaseJDBCTestCase.java:752)
at org.apache.derbyTesting.junit.BaseJDBCTestCase.assertSQLState(BaseJDBCTestCase.java:801)
at org.apache.derbyTesting.junit.BaseJDBCTestCase.assertStatementError(BaseJDBCTestCase.java:970)
at org.apache.derbyTesting.junit.BaseJDBCTestCase.assertStatementError(BaseJDBCTestCase.java:992)
at org.apache.derbyTesting.functionTests.tests.lang.GroupByTest.testGroupByErrors(GroupByTest.java:244)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at org.apache.derbyTesting.junit.BaseTestCase.runBare(BaseTestCase.java:102)
at junit.extensions.TestDecorator.basicRun(TestDecorator.java:24)
at junit.extensions.TestSetup$1.protect(TestSetup.java:21)
at junit.extensions.TestSetup.run(TestSetup.java:25)
at org.apache.derbyTesting.junit.BaseTestSetup.run(BaseTestSetup.java:57)
Caused by: java.sql.SQLSyntaxErrorException: Column reference 'T1.A' is invalid, or is part of an invalid expression. For a SELECT list with a GROUP BY, the columns and expressions being selected may only contain valid grouping expressions and valid aggregate expressions.
at org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown Source)
at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Unknown Source)
at org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown Source)
at org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown Source)
at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown Source)
at org.apache.derby.impl.jdbc.ConnectionChild.handleException(Unknown Source)
at org.apache.derby.impl.jdbc.EmbedStatement.execute(Unknown Source)
at org.apache.derby.impl.jdbc.EmbedStatement.execute(Unknown Source)
at org.apache.derbyTesting.junit.BaseJDBCTestCase.assertStatementError(BaseJDBCTestCase.java:960)
... 29 more
Caused by: java.sql.SQLException: Column reference 'T1.A' is invalid, or is part of an invalid expression. For a SELECT list with a GROUP BY, the columns and expressions being selected may only contain valid grouping expressions and valid aggregate expressions.
at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)
at org.apache.derby.impl.jdbc.SQLExceptionFactory40.wrapArgsForTransportAcrossDRDA(Unknown Source)
... 38 more
Caused by: ERROR 42Y36: Column reference 'T1.A' is invalid, or is part of an invalid expression. For a SELECT list with a GROUP BY, the columns and expressions being selected may only contain valid grouping expressions and valid aggregate expressions.
at org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
at org.apache.derby.impl.sql.compile.VerifyAggregateExpressionsVisitor.visit(Unknown Source)
at org.apache.derby.impl.sql.compile.QueryTreeNode.accept(Unknown Source)
at org.apache.derby.impl.sql.compile.ResultColumn.accept(Unknown Source)
at org.apache.derby.impl.sql.compile.QueryTreeNodeVector.accept(Unknown Source)
at org.apache.derby.impl.sql.compile.SelectNode.bindExpressions(Unknown Source)
at org.apache.derby.impl.sql.compile.DMLStatementNode.bindExpressions(Unknown Source)
at org.apache.derby.impl.sql.compile.DMLStatementNode.bind(Unknown Source)
at org.apache.derby.impl.sql.compile.CursorNode.bindStatement(Unknown Source)
at org.apache.derby.impl.sql.GenericStatement.prepMinion(Unknown Source)
at org.apache.derby.impl.sql.GenericStatement.prepare(Unknown Source)
at org.apache.derby.impl.sql.conn.GenericLanguageConnectionContext.prepareInternalStatement(Unknown Source)
... 32 more

Also in testDistinctGroupBy.

Should a release note have been made regarding this message change?

Thanks!
Suran

Bryan Pendleton added a comment - 31/Mar/09 04:59 PM
Hi Suran, thanks for catching this. I think a release note could be helpful here. I'll see if I can write one.

Bryan Pendleton added a comment - 31/Mar/09 05:15 PM
Proposed release note.

Knut Anders Hatlen added a comment - 01/Apr/09 09:24 AM
The release note looks clear and well written to me. (One tiny inconsistency: The create table statement is in lower case and does not end with a semicolon, whereas the select statement is in upper case and is terminated by a semicolon.)

Knut Anders Hatlen added a comment - 01/Apr/09 09:29 AM
I took the liberty of updating the release note. Changed the create table statement to upper case and removed the semicolon from the select statement.