Issue Details (XML | Word | Printable)

Key: DERBY-4063
Type: Bug Bug
Status: Resolved Resolved
Resolution: Fixed
Priority: Major Major
Assignee: Unassigned
Reporter: Lars Gråmark
Votes: 0
Watchers: 1
Operations

If you were logged in you would be able to see more operations.
Derby

Constraint causes wrong query result when using exists

Created: 18/Feb/09 12:25 PM   Updated: 01/Jul/09 04:25 PM
Component/s: SQL
Affects Version/s: 10.4.2.0
Fix Version/s: 10.3.3.1, 10.3.4.0, 10.4.2.1, 10.4.3.0, 10.5.1.1

Time Tracking:
Not Specified

File Attachments:
  Size
Text File Licensed for inclusion in ASF works noconst_nocond.txt 2009-02-19 04:29 PM Lars Gråmark 8 kB
Text File Licensed for inclusion in ASF works withconst_nocond.txt 2009-02-19 04:30 PM Lars Gråmark 8 kB
Text File Licensed for inclusion in ASF works withconst_withcond.txt 2009-02-19 04:31 PM Lars Gråmark 8 kB
Environment: Running Sun JVM 1.6.0_10
Issue Links:
Reference

Issue & fix info: High Value Fix
Bug behavior facts: Wrong query result, Regression
Resolution Date: 19/Mar/09 05:15 PM


 Description  « Hide
Enabling the primary key constraint yields different results in an exists sub-select.

The select statement below will return the values 1 and 3 when the primary key constraint is disabled in the project table (project_pk).
When the constraint is enabled, the same query returns nothing.

Another interesting effect on the result can be observed when the criteria "AND prj.other = 100" is enabled
in the join clause and when the constraint is enabled.


drop table child;
drop table parent;
drop table project;

CREATE TABLE project (id INT NOT NULL, other INT NOT NULL
--,CONSTRAINT project_pk PRIMARY KEY (id)
);
CREATE TABLE parent (id INT NOT NULL, project_id INT NOT NULL);
CREATE TABLE child (id INT NOT NULL, parent_id INT NOT NULL);

insert into project (id, other) values(50,100);
insert into parent(id, project_id) values (10,50);
insert into parent(id, project_id) values (20,50);
insert into child(id, parent_id) values(1,10);
insert into child(id, parent_id) values(2,20);
insert into child(id, parent_id) values(3,20);

SELECT c0.id
FROM child c0
WHERE EXISTS (
   SELECT MAX(c1.id)
   FROM child c1
   JOIN parent p ON p.id = c1.parent_id
   JOIN project prj ON prj.id = p.project_id
   --AND prj.other = 100
   GROUP BY c1.parent_id
   HAVING MAX(c1.id) = c0.id
);


 All   Comments   Work Log   Change History   Subversion Commits      Sort Order: Ascending order - Click to sort in descending order
Bryan Pendleton added a comment - 19/Feb/09 04:02 AM
This seems like an interesting problem. It seems possible that it might
have to do with subquery flattening or group by re-writing. But that's just a wild guess,
I haven't studied the problem in any depth.

Some questions:
 - Have you tried this script with other versions of Derby to see if the behavior varies?
 - Can you collect the query plans for the various combinations (constraint/no-constraint,
   prj.other=100/no-prj.other=100), and perhaps do some initial comparisons to see
   how the query plans differ? To do this, you'll want to set derby.language.logQueryPlan=true,
   as described here: http://db.apache.org/derby/docs/dev/tuning/ctunsetprop34818.html

Lars Gråmark added a comment - 19/Feb/09 04:27 PM
I've compared with 10.2.2.0 and the problem does NOT appear in that version. Cant really say when the problem started to appear.

I attached three files to this issue that shows the query plan for:
1. Query with no constraint and with disabled condition (noconst_nocond.txt). Query result is correct.
2. Query with constraint and disabled condition (withconst_nocond.txt). Query result is empty and incorrect.
3. Query with constraint and enabled condition (withconst_withcond.txt). Query result is correct.

The query plans from (1) and (3) are very similar except from the operation being considered in (3).
The query plans differ quite much between (1) and (2) but I'm not really confident in the query plan format so I'll leave the analysis to you. What I can see is that optimizer estimation is a bit off the track in most places and that an index scan on PROJECT_PK results in zero rows in (2). You can easily compare the files using meld or some other text-diff tool.
Hope this helps.

Lars Gråmark added a comment - 19/Feb/09 04:29 PM
Constraint disabled. No prj.other condition.

Lars Gråmark added a comment - 19/Feb/09 04:30 PM
Constraint enabled. No prj.other condition.

Lars Gråmark added a comment - 19/Feb/09 04:31 PM
Constraint enabled. With extra condition prj.other=100

Bryan Pendleton added a comment - 19/Feb/09 04:54 PM
Thanks for providing the query plans.

Since the problem does not appear for you in 10.2, it's probably not due to DERBY-883,
because I believe that was merged back to 10.2. However it could be related to DERBY-681.

Knut Anders Hatlen added a comment - 21/Feb/09 07:11 PM
Looks like Bryan's second guess is correct. The behaviour changed in revision 516454 (DERBY-681).

Dag H. Wanvik added a comment - 19/Mar/09 05:15 PM
It appears that this bug is fixed by 711615 (DERBY-3880 fix), at least I
can not reproduce it after that change, so resolving. That is not unreasonable,
since that bug also fixes a regression from DERBY-681 and concerns a JOIN issue.

Lars, if you are able to build Derby from the trunk of the 10.4 branch to verify that
the problem is gone, that would be great!


Dag H. Wanvik added a comment - 19/Mar/09 05:16 PM
Related or duplicated issue; seems solved by fix to DERBY-3880.

Dag H. Wanvik added a comment - 19/Mar/09 05:24 PM
I verified that the patch from DERBY-3880 fixes the error on both trunk, 10.4 and 10.3 branches,
so marking fixed in all those.