Issue Details (XML | Word | Printable)

Key: DERBY-3997
Type: Bug Bug
Status: Resolved Resolved
Resolution: Fixed
Priority: Major Major
Assignee: Knut Anders Hatlen
Reporter: geoff hendrey
Votes: 0
Watchers: 0
Operations

If you were logged in you would be able to see more operations.
Derby

ORDER BY causes column to be returned

Created: 24/Dec/08 06:08 AM   Updated: 22/Jun/09 07:15 PM
Component/s: SQL
Affects Version/s: 10.1.3.1, 10.2.2.0, 10.3.3.0, 10.4.1.3
Fix Version/s: 10.3.3.1, 10.4.2.1, 10.5.1.1

Time Tracking:
Not Specified

File Attachments:
  Size
File Licensed for inclusion in ASF works d3997.diff 2008-12-30 09:05 AM Knut Anders Hatlen 2 kB
File Licensed for inclusion in ASF works d3997.stat 2008-12-30 09:05 AM Knut Anders Hatlen 0.2 kB
Environment: Mac OS 10.4, JDK 1,6
Issue Links:
dependent
 

Urgency: Normal
Resolution Date: 05/Jan/09 03:32 PM


 Description  « Hide
The ORDER BY is causing the ordered column to be retrieved even though it is not part of the SELECT clause. Here is a script to create a table, insert a row, and perform the select:


CREATE TABLE "REVIEWS"."GEOFF__REVIEWS__REVIEW"
(
   PK INTEGER PRIMARY KEY not null,
   numstars BIGINT,
   body VARCHAR(32672),
   title VARCHAR(32672),
   authoremail VARCHAR(32672)
);

INSERT INTO "REVIEWS"."GEOFF__REVIEWS__REVIEW" (PK,numstars,body,title,authoremail) VALUES (0 /*not nullable*/,0,'s','s','s');

SELECT "review"."numstars"
FROM
    "GEOFF__REVIEWS__REVIEW" AS "review"
WHERE
        "review"."PK" = 1
ORDER BY
    "review".PK


 All   Comments   Work Log   Change History   Subversion Commits      Sort Order: Ascending order - Click to sort in descending order
Knut Anders Hatlen added a comment - 24/Dec/08 10:33 PM
I managed to reproduce it with an even simpler script:

ij> create table t (x int primary key, y int);
0 rows inserted/updated/deleted
ij> select y from t where x = 1 order by x;
Y |X
-----------------------

0 rows selected

Remove the WHERE clause or the ORDER BY clause, and the column X disappears.

Bryan Pendleton added a comment - 24/Dec/08 10:47 PM
Thanks Geoff and Knut, these scripts are great. Perhaps the common element
is that in all these scripts, the "extra" column is in both the WHERE and ORDER BY clauses?

geoff hendrey added a comment - 29/Dec/08 07:45 AM
Are you able to provide a target release for a fix? It would be a big help to know when a fix will come in as this bug causes a problem for me in a production environment.

Knut Anders Hatlen added a comment - 29/Dec/08 08:03 PM
I think I've found the cause of the problem. This code in SelectNode.preprocess() will be executed when all the columns in the ORDER BY clause are known to have constant values because of the WHERE clause:

/*
** It's possible for the order by list to shrink to nothing
** as a result of removing constant columns. If this happens,
** get rid of the list entirely.
*/
if (orderByList.size() == 0)
{
orderByList = null;
}

Later, in SelectNode.genProjectRestrict() the extra ORDER BY columns are supposed to be removed from the result, but this is only done if orderByList != null, as it assumes that (orderByList == null) means that there are no columns to remove. Calling resultColumns.removeOrderByColumns() when we set orderByList to null appears to fix the problem, but I haven't run the regression tests yet.

Bryan Pendleton added a comment - 30/Dec/08 12:27 AM
Thanks Knut for having a go at this. Your analysis sounds quite good to me.

One interesting variant will be when the select list contains *, as I think there are
some complexities about whether you can remove the ORDER BY column in
this case, since it should end up in the result after all.

So it might be worth ensuring that you try a few test cases including select * queries
when testing the patch.

Knut Anders Hatlen added a comment - 30/Dec/08 09:05 AM
Here's a patch with a fix and a test case. Both suites.All and derbyall ran cleanly.

Thanks for the suggestion about * in the select list, Bryan. I tested that manually, and I didn't find any problem. ResultColumnList.removeOrderByColumns() should only remove columns that have been added for the order by, and it should leave the original select list intact. I haven't added any test case to ensure that elimination of the order by clause doesn't remove columns that should have been in the result, as that seems to be tested already.

Bryan Pendleton added a comment - 30/Dec/08 04:01 PM
Hi Knut, the patch looks fine to me. I verified that the new test fails
without the change to SelectNode.java and passes with the change.

I also tried a few other test cases, and confirmed that their results
look reasonable. I'm not sure that they add anything all that valuable
to the patch, but for completeness here's the new cases I tried:

1) selecting an actual column from the table rather than a constant expression
2) selecting * from the table
3) selecting from the table with a WHERE clause that does NOT reduce
the ORDER BY to a a constant.

If you think they're useful enough to add to the patch, go for it, otherwise
I think you can just commit the patch as is.

Thanks again for picking this issue up!

ij> create table d3997_a(a int, b int, c int);
0 rows inserted/updated/deleted
ij> insert into d3997_a values (1,2,3);
1 row inserted/updated/deleted
ij> select a from d3997_a where b = 2 order by b;
A
-----------
1

1 row selected
ij> select * from d3997_a where b != 2 order by b;
A |B |C
-----------------------------------

0 rows selected
ij> select c from d3997_a where b > 1 order by b;
C
-----------
3

1 row selected

Knut Anders Hatlen added a comment - 30/Dec/08 05:22 PM
Thanks for looking at the patch, Bryan, and for the extra test cases.

I added some variations of your test cases and a test case for dynamic columns (see below) and committed revision 730188. I plan to back-port the fix to 10.4 so I'm leaving the issue open.

-- DERBY-3997: Elimination of ORDER BY clause because all the columns
-- to order by were known to be constant, made extra columns appear in
-- the result.
create table d3997(x int, y int, z int);
-- These queries used to have two result columns, but should only have one
select 1 from d3997 where x=1 order by x;
select y from d3997 where x=1 order by x;
-- Used to have three columns, should only have two
select y,z from d3997 where x=1 order by x;
-- Used to have three columns, should only have one
select x from d3997 where y=1 and z=1 order by y,z;
-- Dynamic parameters are also constants (expect one column)
execute 'select x from d3997 where y=? order by y' using 'values 1';
-- Order by columns should not be removed from the result here
select * from d3997 where x=1 order by x;
select x,y,z from d3997 where x=1 order by x;
select x,y,z from d3997 where x=1 and y=1 order by x,y;
-- Order by should not be eliminated here (not constant values). Insert some
-- data in reverse order to verify that the results are sorted.
insert into d3997 values (9,8,7),(6,5,4),(3,2,1);
select * from d3997 where y<>2 order by y;
select z from d3997 where y>2 order by y;
drop table d3997;

geoff hendrey added a comment - 01/Jan/09 06:25 PM
Thanks guys. Do we know when a derby binary release will include this fix?

Knut Anders Hatlen added a comment - 01/Jan/09 10:40 PM
There has been some talk about producing a 10.5 release in the first quarter of 2009, but no target date has been decided yet.
http://mail-archives.apache.org/mod_mbox/db-derby-dev/200811.mbox/%3C491C864E.6010008@sun.com%3E

Knut Anders Hatlen added a comment - 05/Jan/09 03:32 PM
Merged the fix to 10.4 and committed revision 731599.

Kathey Marsden added a comment - 18/Jun/09 06:08 PM
This exists back to 10.1, fixing affects version

Kathey Marsden added a comment - 22/Jun/09 06:28 PM
I am backporting this patch to 10.2 and notice that it uses ResultColumnList.removeOderByColumns which was added in DERBY-681. Just adding the one method to the 10.2 ResultColumnList seems to make everything work ok and orderByElimination passes, but I just thought I would mention it in case it seems like a problem to anyone. I will run the full regression tests.


Kathey Marsden added a comment - 22/Jun/09 07:15 PM
checked into 10.3. This will still go to 10.2 and 10.1