|
[
Permlink
| « Hide
]
Knut Anders Hatlen added a comment - 01/Aug/09 11:15 PM
Attached a script that creates a sample database and executes the problematic query.
I try the same query without subquery and the result is order.
ij> SELECT CS.ID FROM CHANGESETS CS, FILECHANGES FC, REPOSITORIES R, FILES F, AUTHORS A WHERE R.PATH = '/var/tmp/source5923202038296723704opengrok/mercurial' AND F.REPOSITORY = R.ID AND A.REPOSITORY = R.ID AND CS.REPOSITORY = R.ID AND CS.ID = FC.CHANGESET AND F.ID = FC.FILE AND A.ID = CS.AUTHOR ORDER BY CS.ID DESC; ID ----------- 3 3 2 2 2 1 6 rows selected Hopefully, this will narrow the join logic a little bit. Calling SYSCS_UTIL.SYSCS_UPDATE_STATISTICS on the tables involved also makes the result come out in the right order, so it seems like the bug only appears when a certain plan is chosen.
The problem does not occur when I revert to revision 783167, but it does occur with revision 783168, which suggests that it was triggered by One thing I don't understand about this regression is the description of the fix for
" The decision for avoiding sorting should also include relationship between the optimizables in a given join order " so how did that cause a new sort avoidance problem? The comments also say: "I think this patch is also improving the existing queries to include a better path than what it was picking up before." Is it a change in plan that is causing the regression, perhaps choosing a plan that had always ordered incorrectly? If anyone has time, what would be useful is the smallest test case possible with a set of optimizer hints that forces the bad result. This will help in understanding exactly what is going on.
If I remember correctly, the all the plans. It turned out that some of the plans it eliminated actually did return the correct order but just through luck. So the question with this bug, which is obviously affected by now mark a plan we previously skipped as valid. If the repro had a fixed join plan that showed bad sort now and good sort before we could more easily tell what the regression is. here is the query plan from the bad sort results query, gotten by adding derby.language.logQueryPlan=true
to derby.properties. Posting now, have not read through it yet - but first thing to notice is that there is not any sort node so as assumed the problem is not a sorter bug - but some sort of sort avoidance plan bug: 2009-08-03 18:09:06.296 GMT Thread[main,5,main] (XID = 259), (SESSIONID = 3), SELECT CS.ID^M FROM^M CHANGESETS CS, FILECHANGES FC, REPOSITORIES R, FILES F, AUTHORS A^M WHERE^M R.PATH = '/var/tmp/source5923202038296723704opengrok/mercurial' AND ^M F.REPOSITORY = R.ID AND ^M A.REPOSITORY = R.ID AND^M CS.REPOSITORY = R.ID AND ^M CS.ID = FC.CHANGESET AND ^M F.ID = FC.FILE AND^M A.ID = CS.AUTHOR AND^M EXISTS (^M SELECT 1^M FROM FILES F2^M WHERE^M F2.ID = FC.FILE AND F2.REPOSITORY = R.ID AND^M F2.PATH LIKE '/%' ESCAPE '#')^M ORDER BY CS.ID DESC ******* Project-Restrict ResultSet (18): Number of opens = 1 Rows seen = 6 Rows filtered = 0 restriction = false projection = true constructor time (milliseconds) = 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 restriction time (milliseconds) = 0 projection time (milliseconds) = 0 optimizer estimated row count: 0.01 optimizer estimated cost: 157.72 Source result set: Nested Loop Exists Join ResultSet: Number of opens = 1 Rows seen from the left = 6 Rows seen from the right = 6 Rows filtered = 0 Rows returned = 6 constructor time (milliseconds) = 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 optimizer estimated row count: 0.01 optimizer estimated cost: 157.72 Left result set: Nested Loop Join ResultSet: Number of opens = 1 Rows seen from the left = 6 Rows seen from the right = 6 Rows filtered = 0 Rows returned = 6 constructor time (milliseconds) = 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 optimizer estimated row count: 0.01 optimizer estimated cost: 157.10 Left result set: Nested Loop Join ResultSet: Number of opens = 1 Rows seen from the left = 16 Rows seen from the right = 6 Rows filtered = 0 Rows returned = 6 constructor time (milliseconds) = 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 optimizer estimated row count: 0.02 optimizer estimated cost: 155.23 Left result set: Nested Loop Join ResultSet: Number of opens = 1 Rows seen from the left = 4 Rows seen from the right = 16 Rows filtered = 0 Rows returned = 16 constructor time (milliseconds) = 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 optimizer estimated row count: 0.20 optimizer estimated cost: 154.53 Left result set: Nested Loop Join ResultSet: Number of opens = 1 Rows seen from the left = 1 Rows seen from the right = 4 Rows filtered = 0 Rows returned = 4 constructor time (milliseconds) = 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 optimizer estimated row count: 0.22 optimizer estimated cost: 146.81 Left result set: Index Row to Base Row ResultSet for REPOSITORIES: Number of opens = 1 Rows seen = 1 Columns accessed from heap = {0} constructor time (milliseconds) = 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 optimizer estimated row count: 1.00 optimizer estimated cost: 135.64 Index Scan ResultSet for REPOSITORIES using constraint SQL090803110642671 at read committed isolation level using share ro w locking chosen by the optimizer Number of opens = 1 Rows seen = 1 Rows filtered = 0 Fetch Size = 1 constructor time (milliseconds) = 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 next time in milliseconds/row = 0 scan information: Bit set of columns fetched=All Number of columns fetched=2 Number of deleted rows visited=0 Number of pages visited=1 Number of rows qualified=1 Number of rows visited=1 Scan type=btree Tree height=-1 start position: >= on first 1 column(s). Ordered null semantics on the following columns: 0 stop position: > on first 1 column(s). Ordered null semantics on the following columns: 0 qualifiers: None optimizer estimated row count: 1.00 optimizer estimated cost: 135.64 Right result set: Index Row to Base Row ResultSet for FILES: Number of opens = 1 Rows seen = 4 Columns accessed from heap = {0, 1, 2} constructor time (milliseconds) = 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 optimizer estimated row count: 0.22 optimizer estimated cost: 11.17 Index Scan ResultSet for FILES using constraint SQL090803110643012 at read committed isolation level using share row locki ng chosen by the optimizer Number of opens = 1 Rows seen = 4 Rows filtered = 0 Fetch Size = 1 constructor time (milliseconds) = 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 next time in milliseconds/row = 0 scan information: Bit set of columns fetched=All Number of columns fetched=3 Number of deleted rows visited=0 Number of pages visited=1 Number of rows qualified=4 Number of rows visited=4 Scan type=btree Tree height=1 start position: >= on first 2 column(s). Ordered null semantics on the following columns: 0 1 stop position: >= on first 2 column(s). Ordered null semantics on the following columns: 0 1 qualifiers: None optimizer estimated row count: 0.22 optimizer estimated cost: 11.17 Right result set: Index Row to Base Row ResultSet for FILES: Number of opens = 4 Rows seen = 16 Columns accessed from heap = {0, 2} constructor time (milliseconds) = 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 optimizer estimated row count: 0.20 optimizer estimated cost: 7.72 Index Scan ResultSet for FILES using constraint SQL090803110643012 at read committed isolation level using share row locking c hosen by the optimizer Number of opens = 4 Rows seen = 16 Rows filtered = 0 Fetch Size = 1 constructor time (milliseconds) = 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 next time in milliseconds/row = 0 scan information: Bit set of columns fetched={0, 2} Number of columns fetched=2 Number of deleted rows visited=0 Number of pages visited=4 Number of rows qualified=16 Number of rows visited=16 Scan type=btree Tree height=1 start position: >= on first 1 column(s). Ordered null semantics on the following columns: 0 stop position: > on first 1 column(s). Ordered null semantics on the following columns: 0 qualifiers: Column[0][0] Id: 0 Operator: = Ordered nulls: false Unknown return value: false Negate comparison result: false optimizer estimated row count: 0.20 optimizer estimated cost: 7.72 Right result set: Index Scan ResultSet for FILECHANGES using constraint SQL090803110644463 at read committed isolation level using instantaneous share r ow locking chosen by the optimizer Number of opens = 16 Rows seen = 6 Rows filtered = 0 Fetch Size = 16 constructor time (milliseconds) = 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 next time in milliseconds/row = 0 scan information: Bit set of columns fetched={0, 1} Number of columns fetched=2 Number of deleted rows visited=0 Number of pages visited=16 Number of rows qualified=6 Number of rows visited=36 Scan type=btree Tree height=1 start position: >= on first 1 column(s). Ordered null semantics on the following columns: 0 stop position: > on first 1 column(s). Ordered null semantics on the following columns: 0 qualifiers: Column[0][0] Id: 0 Operator: = Ordered nulls: false Unknown return value: false Negate comparison result: false optimizer estimated row count: 0.02 optimizer estimated cost: 0.70 Right result set: Index Row to Base Row ResultSet for AUTHORS: Number of opens = 6 Rows seen = 6 Columns accessed from heap = {0, 1} constructor time (milliseconds) = 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 optimizer estimated row count: 0.01 optimizer estimated cost: 1.87 Index Scan ResultSet for AUTHORS using constraint SQL090803110643392 at read committed isolation level using share row locking chosen by the optimizer Number of opens = 6 Rows seen = 6 Rows filtered = 0 Fetch Size = 1 constructor time (milliseconds) = 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 next time in milliseconds/row = 0 scan information: Bit set of columns fetched={0, 2} Number of columns fetched=2 Number of deleted rows visited=0 Number of pages visited=6 Number of rows qualified=6 Number of rows visited=6 Scan type=btree Tree height=1 start position: >= on first 1 column(s). Ordered null semantics on the following columns: 0 stop position: > on first 1 column(s). Ordered null semantics on the following columns: 0 qualifiers: Column[0][0] Id: 0 Operator: = Ordered nulls: false Unknown return value: false Negate comparison result: false optimizer estimated row count: 0.01 optimizer estimated cost: 1.87 Right result set: Project-Restrict ResultSet (17): Number of opens = 6 Rows seen = 6 Rows filtered = 0 restriction = true projection = true constructor time (milliseconds) = 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 restriction time (milliseconds) = 0 projection time (milliseconds) = 0 optimizer estimated row count: 0.01 optimizer estimated cost: 0.62 Source result set: Index Row to Base Row ResultSet for CHANGESETS: Number of opens = 6 Rows seen = 6 Columns accessed from heap = {1, 3} constructor time (milliseconds) = 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 optimizer estimated row count: 0.01 optimizer estimated cost: 0.62 Index Scan ResultSet for CHANGESETS using constraint SQL090803110643850 at read committed isolation level using share row locking chos en by the optimizer Number of opens = 6 Rows seen = 6 Rows filtered = 0 Fetch Size = 1 constructor time (milliseconds) = 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 next time in milliseconds/row = 0 scan information: Bit set of columns fetched=All Number of columns fetched=2 Number of deleted rows visited=0 Number of pages visited=6 Number of rows qualified=6 Number of rows visited=6 Scan type=btree Tree height=1 start position: >= on first 1 column(s). Ordered null semantics on the following columns: 0 stop position: > on first 1 column(s). Ordered null semantics on the following columns: 0 qualifiers: None optimizer estimated row count: 0.01 optimizer estimated cost: 0.62 I will start looking at this bug. It will be great if the query can be made simpler and/or use optimizer overrides to make it always break so the debugger does not go through many iterations through the optimizer. I will post more info as I am progress.
I added names to all the constraints/primary keys to the original repro.sql script to make it easier to understand what is going on when looking at query plans.
Here is query plan for bad sort result as generated by repro2.sql
Wanted to share that if the same query is run with the optimizer ovreride of join order fixed, the query does not give incorrect results. The query plan at this time is as follows
Statement Name: null Statement Text: SELECT CS.ID FROM --DERBY-PROPERTIES joinOrder=FIXED CHANGESETS CS, FILECHANGES FC, REPOSITORIES R, FILES F, AUTHORS A WHERE R.PATH = '/var/tmp/source5923202038296723704opengrok/mercurial' AND F.REPOSITORY = R.ID AND A.REPOSITORY = R.ID AND CS.REPOSITORY = R.ID AND CS.ID = FC.CHANGESET AND F.ID = FC.FILE AND A.ID = CS.AUTHOR AND EXISTS ( SELECT 1 FROM FILES F2 WHERE F2.ID = FC.FILE AND F2.REPOSITORY = R.ID AND F2.PATH LIKE '/%' ESCAPE '#') ORDER BY CS.ID DESC Parse Time: 0 Bind Time: 0 Optimize Time: 0 Generate Time: 0 Compile Time: 0 Execute Time: 0 Begin Compilation Timestamp : null End Compilation Timestamp : null Begin Execution Timestamp : null End Execution Timestamp : null Statement Execution Plan Text: Sort ResultSet: Number of opens = 1 Rows input = 6 Rows returned = 6 Eliminate duplicates = false In sorted order = false Sort information: Number of rows input=6 Number of rows output=6 Sort type=internal constructor time (milliseconds) = 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 optimizer estimated row count: 0.08 optimizer estimated cost: 553.68 Source result set: Project-Restrict ResultSet (20): Number of opens = 1 Rows seen = 6 Rows filtered = 0 restriction = false projection = true constructor time (milliseconds) = 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 restriction time (milliseconds) = 0 projection time (milliseconds) = 0 optimizer estimated row count: 0.08 optimizer estimated cost: 553.68 Source result set: User supplied optimizer overrides for join are { joinOrder=FIXED } Nested Loop Join ResultSet: Number of opens = 1 Rows seen from the left = 6 Rows seen from the right = 6 Rows filtered = 0 Rows returned = 6 constructor time (milliseconds) = 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 optimizer estimated row count: 0.08 optimizer estimated cost: 553.68 Left result set: User supplied optimizer overrides for join are { joinOrder=FIXED } Nested Loop Join ResultSet: Number of opens = 1 Rows seen from the left = 6 Rows seen from the right = 6 Rows filtered = 0 Rows returned = 6 constructor time (milliseconds) = 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 optimizer estimated row count: 0.36 optimizer estimated cost: 546.94 Left result set: User supplied optimizer overrides for join are { joinOrder=FIXED } Nested Loop Exists Join ResultSet: Number of opens = 1 Rows seen from the left = 6 Rows seen from the right = 6 Rows filtered = 0 Rows returned = 6 constructor time (milliseconds) = 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 optimizer estimated row count: 0.60 optimizer estimated cost: 475.57 Left result set: User supplied optimizer overrides for join are { joinOrder=FIXED } Nested Loop Exists Join ResultSet: Number of opens = 1 Rows seen from the left = 6 Rows seen from the right = 6 Rows filtered = 0 Rows returned = 6 constructor time (milliseconds)= 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 optimizer estimated row count: 0.60 optimizer estimated cost: 454.07 Left result set: User supplied optimizer overrides for join are { joinOrder=FIXED } Hash Join ResultSet: Number of opens = 1 Rows seen from the left = 3 Rows seen from the right = 6 Rows filtered = 0 Rows returned = 6 constructor time (milliseconds) = 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 optimizer estimated rowcount: 0.60 optimizer estimated cost: 372.68 Left result set: Table Scan ResultSet for CHANGESETS at read committed isolation level using instantaneous share row locking chosen by the optimizer Number of opens = 1 Rows seen = 3 Rows filtered = 0 Fetch Size = 16 constructor time (milliseconds) = 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 next time in milliseconds/row = 0 scan information: Bit set of columns fetched={0, 1, 3} Number of columns fetched=3 Number of pagesvisited=1 Number of rows qualified=3 Number of rows visited=3 Scan type=heap start position: null stop position: null qualifiers: None optimizer estimated row count: 8.00 optimizer estimated cost: 351.84 Right result set: Hash Scan ResultSet for FILECHANGES using constraint FILECHANGES_FILE_CHANGESET at read committed isolation level using instantaneous share row locking: Number of opens = 3 Hash table size = 3 Hash key is column number 1 Rows seen = 6 Rows filtered = 0 constructor time (milliseconds) = 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 next time in milliseconds/row = 0 scan information: Bit set of columns fetched={0, 1} Number of columns fetched=2 Number of deleted rows visited=0 Number of pagesvisited=1 Number of rows qualified=6 Number of rows visited=6 Scan type=btree Tree height=1 start position: None stop position: None scan qualifiers: None next qualifiers: Column[0][0] Id: 1 Operator: = Ordered nulls: false Unknown return value: false Negate comparison result: false optimizer estimated row count: 0.60 optimizer estimated cost: 20.84 Right result set: Project-Restrict ResultSet (10): Number of opens = 6 Rows seen = 6 Rows filtered = 0 restriction = true projection = true constructor time (milliseconds) = 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 restriction time (milliseconds) = 0 projection time (milliseconds) = 0 optimizer estimated rowcount: 0.60 optimizer estimated cost: 81.39 Source result set: Index Row to Base Row ResultSet for REPOSITORIES: Number of opens = 6 Rows seen = 6 Columns accessed from heap = {1} constructor time (milliseconds) = 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 optimizer estimated row count: 0.60 optimizer estimated cost: 81.39 Index Scan ResultSet for REPOSITORIES using constraint REPOSITORIES_PRIMARY_ID at read committed isolation level using share row locking chosen by the optimizer Number of opens= 6 Rows seen = 6 Rows filtered =0 Fetch Size = 1 constructor time (milliseconds) = 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 next time in milliseconds/row = 0 scan information: Bit set of columns fetched=All Number of columns fetched=2 Number of deleted rows visited=0 Number of pages visited=6 Number of rows qualified=6 Number of rows visited=6 Scan type=btree Tree height=1 start position: >= on first 1 column(s). Ordered null semantics on the following columns:0 stop position: > on first 1 column(s). Ordered null semantics on the following columns:0 qualifiers: None optimizer estimated row count: 0.60 optimizer estimated cost: 81.39 Right result set: Project-Restrict ResultSet (13): Number of opens = 6 Rows seen = 6 Rows filtered = 0 restriction = true projection = true constructor time (milliseconds)= 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 restriction time (milliseconds)= 0 projection time (milliseconds) = 0 optimizer estimated row count: 0.60 optimizer estimated cost: 21.50 Source result set: Index Row to Base Row ResultSet for FILES: Number of opens = 6 Rows seen = 6 Columns accessed from heap = {2} constructor time (milliseconds) = 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 optimizer estimated rowcount: 0.60 optimizer estimated cost: 21.50 Index Scan ResultSet for FILES using constraint FILES_PRIMARY_ID at read committed isolation level using share row locking chosen by the optimizer Number of opens = 6 Rows seen = 6 Rows filtered = 0 Fetch Size = 1 constructor time (milliseconds) = 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 next time in milliseconds/row = 0 scan information: Bit set of columns fetched=All Number of columns fetched=2 Number of deleted rows visited=0 Number of pages visited=6 Number of rows qualified=6 Number of rows visited=6 Scan type=btree Tree height=1 start position: >= on first 1 column(s). Ordered null semantics on the following columns:0 stop position: > on first 1 column(s). Ordered null semantics on the following columns:0 qualifiers:None optimizer estimated row count: 0.60 optimizer estimated cost: 21.50 Right result set: Project-Restrict ResultSet (16): Number of opens = 6 Rows seen = 6 Rows filtered = 0 restriction = true projection = true constructor time (milliseconds) = 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 restriction time (milliseconds) = 0 projection time (milliseconds) = 0 optimizer estimated row count: 0.36 optimizer estimated cost: 71.37 Source result set: Index Row to Base Row ResultSet for AUTHORS: Number of opens = 6 Rows seen = 6 Columns accessed from heap = {0, 1} constructor time (milliseconds)= 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 optimizer estimated row count: 0.36 optimizer estimated cost: 71.37 Index Scan ResultSet for AUTHORS using constraint AUTHORS_REPOSITORY_NAME at read committed isolation level using instantaneous share row locking chosen by the optimizer Number of opens = 6 Rows seen = 6 Rows filtered = 0 Fetch Size = 16 constructor time (milliseconds) = 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 next time in milliseconds/row = 0 scan information: Bit set of columns fetched={0, 2} Number of columns fetched=2 Number of deleted rows visited=0 Number of pages visited=6 Number of rows qualified=6 Number of rows visited=6 Scan type=btree Tree height=1 start position: >= on first 1 column(s). Ordered null semantics on the following columns:0 stop position: > on first 1 column(s). Ordered null semantics on the following columns:0 qualifiers: Column[0][0] Id: 0 Operator: = Ordered nulls: false Unknown return value: false Negate comparison result: false optimizer estimated rowcount: 0.36 optimizer estimated cost: 71.37 Right result set: Project-Restrict ResultSet (19): Number of opens = 6 Rows seen = 24 Rows filtered = 18 restriction = true projection = true constructor time (milliseconds) = 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 restriction time (milliseconds) = 0 projection time (milliseconds) = 0 optimizer estimated row count: 0.08 optimizer estimated cost: 6.73 Source result set: Index Row to Base Row ResultSet for FILES: Number of opens = 6 Rows seen = 24 Columns accessed from heap = {0, 1, 2} constructor time (milliseconds) = 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 optimizer estimated row count: 0.08 optimizer estimated cost: 6.73 Index Scan ResultSet for FILES using constraint FILES_REPOSITORY_PATH at read committed isolation level using instantaneous share row locking chosen by the optimizer Number of opens = 6 Rows seen = 24 Rows filtered = 0 Fetch Size = 16 constructor time (milliseconds)= 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 next time in milliseconds/row =0 scan information: Bit set of columns fetched=All Number of columns fetched=3 Number of deleted rows visited=0 Number of pages visited=6 Number of rows qualified=24 Number of rows visited=24 Scan type=btree Tree height=1 start position: >= on first 2 column(s). Ordered null semantics on the following columns:0 1 stop position: >= on first 2 column(s). Ordered null semantics on the following columns: 0 1 qualifiers: Column[0][0] Id: 0 Operator: = Ordered nulls: false Unknown return value: false Negate comparison result: false optimizer estimated row count: 0.08 optimizer estimated cost: 6.73 BTW, just an FYI, that I am doing my research on this bug in 10.5 codeline.
Fix affect versions. updated wrong issue.
The following updated query is still pretty involved but it consistently reproduces the wrong results and will go through very limited iteration of optimizer code because of all the optimizer overrides. Ideal will be if the following query can be made simpler with fewer number of tables.
SELECT CS.ID FROM --DERBY-PROPERTIES joinOrder=FIXED REPOSITORIES R -- DERBY-PROPERTIES constraint=REPOSITORIES_PATH , FILES F -- DERBY-PROPERTIES constraint=FILES_REPOSITORY_PATH , FILECHANGES FC -- DERBY-PROPERTIES constraint=FILECHANGES_FILE_CHANGESET , AUTHORS A -- DERBY-PROPERTIES constraint=AUTHORS_REPOSITORY_NAME , CHANGESETS CS -- DERBY-PROPERTIES constraint=CHANGESETS_PRIMARY_ID WHERE R.PATH = '/var/tmp/source5923202038296723704opengrok/mercurial' AND F.REPOSITORY = R.ID AND A.REPOSITORY = R.ID AND CS.REPOSITORY = R.ID AND CS.ID = FC.CHANGESET AND F.ID = FC.FILE AND A.ID = CS.AUTHOR AND EXISTS ( SELECT 1 FROM FILES F2 WHERE F2.ID = FC.FILE AND F2.REPOSITORY = R.ID AND F2.PATH LIKE '/%' ESCAPE '#') ORDER BY CS.ID DESC; i have been looking at the query plan and have more questions than answers.
One strange thing is that the bad query plan does not use the descending index at all. With would naturally satisfy a sort avoidance plan was used but in some part of the join order which actually didn't result in the row ordering being correct. The plan looks like it does some sort of query transformation, maybe flatten but not sure if that is right term. The EXISTS subquery I think is making it hard to control the plan with hints, but am not sure. If anyone can figure out a rewrite that reproduces the bug without the subquery that would help also. I tried the following query thinking just the existence of the subquery might be enough, but it got correct ordered results: SELECT CS.ID FROM CHANGESETS CS, FILECHANGES FC, REPOSITORIES R, FILES F, AUTHORS A WHERE R.PATH = '/var/tmp/source5923202038296723704opengrok/mercurial' AND F.REPOSITORY = R.ID AND A.REPOSITORY = R.ID AND CS.REPOSITORY = R.ID AND CS.ID = FC.CHANGESET AND F.ID = FC.FILE AND A.ID = CS.AUTHOR AND EXISTS ( SELECT 1 FROM FILES F2 WHERE F2.REPOSITORY = 1) ORDER BY CS.ID DESC; Here is a fixed order join query that repro's on trunk but does not repro on 10.5 before the 3926
changes. I shuffled the order of the tables in the from list to match order in the attached plan where the bug reproduced - nothing else is changed.: SELECT CS.ID FROM --DERBY-PROPERTIES joinOrder=FIXED REPOSITORIES R, FILES F, FILECHANGES FC, AUTHORS A, CHANGESETS CS WHERE R.PATH = '/var/tmp/source5923202038296723704opengrok/mercurial' AND F.REPOSITORY = R.ID AND A.REPOSITORY = R.ID AND CS.REPOSITORY = R.ID AND CS.ID = FC.CHANGESET AND F.ID = FC.FILE AND A.ID = CS.AUTHOR AND EXISTS ( SELECT 1 FROM FILES F2 WHERE F2.ID = FC.FILE AND F2.REPOSITORY = R.ID AND F2.PATH LIKE '/%' ESCAPE '#') ORDER BY CS.ID DESC; This query also gets out of order rows in trunk but not in 10.5 before 3629 change. It fixes the indexes used also:
SELECT CS.ID FROM --DERBY-PROPERTIES joinOrder=FIXED REPOSITORIES R --DERBY-PROPERTIES constraint=REPOSITORIES_PATH , FILES F --DERBY-PROPERTIES constraint=FILES_REPOSITORY_PATH , FILECHANGES FC --DERBY-PROPERTIES constraint=FILECHANGES_FILE_CHANGESET , AUTHORS A --DERBY-PROPERTIES constraint=AUTHORS_REPOSITORY_NAME , CHANGESETS CS --DERBY-PROPERTIES constraint=CHANGESETS_PRIMARY_ID WHERE R.PATH = '/var/tmp/source5923202038296723704opengrok/mercurial' AND F.REPOSITORY = R.ID AND A.REPOSITORY = R.ID AND CS.REPOSITORY = R.ID AND CS.ID = FC.CHANGESET AND F.ID = FC.FILE AND A.ID = CS.AUTHOR AND EXISTS ( SELECT 1 FROM FILES F2 --DERBY-PROPERTIES constraint=FILES_REPOSITORY_PATH WHERE F2.ID = FC.FILE AND F2.REPOSITORY = R.ID AND F2.PATH LIKE '/%' ESCAPE '#') ORDER BY CS.ID DESC; This query is just an existence of a easier subquery that failed on trunk but gives correct order on svn 783167:
ij> SELECT CS.ID FROM --DERBY-PROPERTIES joinOrder=FIXED REPOSITORIES R --DERBY-PROPERTIES constraint=REPOSITORIES_PATH , FILES F --DERBY-PROPERTIES constraint=FILES_REPOSITORY_PATH , FILECHANGES FC --DERBY-PROPERTIES constraint=FILECHANGES_FILE_CHANGESET , AUTHORS A --DERBY-PROPERTIES constraint=AUTHORS_REPOSITORY_NAME , CHANGESETS CS --DERBY-PROPERTIES constraint=CHANGESETS_PRIMARY_ID WHERE R.PATH = '/var/tmp/source5923202038296723704opengrok/mercurial' AND F.REPOSITORY = R.ID AND A.REPOSITORY = R.ID AND CS.REPOSITORY = R.ID AND CS.ID = FC.CHANGESET AND F.ID = FC.FILE AND A.ID = CS.AUTHOR AND EXISTS ( SELECT 1 FROM FILES F2 --DERBY-PROPERTIES constraint=FILES_REPOSITORY_PATH WHERE F2.ID = FC.FILE ) ORDER BY CS.ID DESC; ID ----------- 1 2 3 2 2 3 6 rows selected Thanks, Lily Taking out Author table from previous query still not ordering in trunk but works 10.5 (783167)
ij> SELECT CS.ID FROM --DERBY-PROPERTIES joinOrder=FIXED REPOSITORIES R --DERBY-PROPERTIES constraint=REPOSITORIES_PATH , FILES F --DERBY-PROPERTIES constraint=FILES_REPOSITORY_PATH , FILECHANGES FC --DERBY-PROPERTIES constraint=FILECHANGES_FILE_CHANGESET , CHANGESETS CS --DERBY-PROPERTIES constraint=CHANGESETS_PRIMARY_ID WHERE R.PATH = '/var/tmp/source5923202038296723704opengrok/mercurial' AND F.REPOSITORY = R.ID AND CS.REPOSITORY = R.ID AND CS.ID = FC.CHANGESET AND F.ID = FC.FILE AND EXISTS ( SELECT 1 FROM FILES F2 --DERBY-PROPERTIES constraint=FILES_REPOSITORY_PATH WHERE F2.ID = FC.FILE) ORDER BY CS.ID DESC; ID ----------- 1 2 3 2 2 3 If take out of 'CS.ID = FC.CHANGESET' from where cause constraint, the query will perform order by on both trunk and 10.5(783167) ij> SELECT CS.ID FROM --DERBY-PROPERTIES joinOrder=FIXED REPOSITORIES R --DERBY-PROPERTIES constraint=REPOSITORIES_PATH , FILES F --DERBY-PROPERTIES constraint=FILES_REPOSITORY_PATH , FILECHANGES FC --DERBY-PROPERTIES constraint=FILECHANGES_FILE_CHANGESET , CHANGESETS CS --DERBY-PROPERTIES constraint=CHANGESETS_PRIMARY_ID WHERE R.PATH = '/var/tmp/source5923202038296723704opengrok/mercurial' AND F.REPOSITORY = R.ID AND CS.REPOSITORY = R.ID AND F.ID = FC.FILE AND EXISTS ( SELECT 1 FROM FILES F2 --DERBY-PROPERTIES constraint=FILES_REPOSITORY_PATH WHERE F2.ID = FC.FILE) ORDER BY CS.ID DESC; ID ----------- 3 3 3 3 3 3 2 2 2 2 2 2 1 1 1 1 1 1 18 rows selected I cannot really tell what is the difference between the two path. However, the value -1 for Tree height on the first query seems odd to me. The derby.log for the first query is on notorderby4331.zip and the second query is on orderby4331.zip. (The below capture from derby.log on notorderby4331.zip) User supplied optimizer overrides on REPOSITORIES are { index=SQL090803170751640 } Index Scan ResultSet for REPOSITORIES using constraint REPOSITORIES_PATH at read committed isolation level using share row locking chosen by the optimizer Number of opens = 1 Rows seen = 1 Rows filtered = 0 Fetch Size = 1 constructor time (milliseconds) = 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 next time in milliseconds/row = 0 scan information: Bit set of columns fetched=All Number of columns fetched=2 Number of deleted rows visited=0 Number of pages visited=1 Number of rows qualified=1 Number of rows visited=1 Scan type=btree Tree height=-1 <<<==== -1 is an odd value start position: >= on first 1 column(s). (The below capture from derby.log on orderby4331.zip) User supplied optimizer overrides on REPOSITORIES are { index=SQL090803170751640 } Index Scan ResultSet for REPOSITORIES using constraint REPOSITORIES_PATH at read committed isolation level using share row locking chosen by the optimizer Number of opens = 1 Rows seen = 1 Rows filtered = 0 Fetch Size = 1 constructor time (milliseconds) = 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 next time in milliseconds/row = 0 scan information: Bit set of columns fetched=All Number of columns fetched=2 Number of deleted rows visited=0 Number of pages visited=1 Number of rows qualified=1 Number of rows visited=1 Scan type=btree Tree height=1 <<<==== not -1 value start position: >= on first 1 column(s). Hope this info is helpful. Thanks, Lily The following query also seems to give wrong results on 10.5 codeline. It has tables REPOSITORIES and AUTHORS missing compared to the original query. I will try it without the changes for
SELECT CS.ID FROM --DERBY-PROPERTIES joinOrder=FIXED FILES F --DERBY-PROPERTIES constraint=FILES_REPOSITORY_PATH , FILECHANGES FC --DERBY-PROPERTIES constraint=FILECHANGES_FILE_CHANGESET , CHANGESETS CS --DERBY-PROPERTIES constraint=CHANGESETS_PRIMARY_ID WHERE CS.ID = FC.CHANGESET AND F.ID = FC.FILE AND EXISTS ( SELECT 1 FROM FILES F2 --DERBY-PROPERTIES constraint=FILES_REPOSITORY_PATH WHERE F2.ID = FC.FILE) ORDER BY CS.ID DESC; ID ----------- 1 2 3 2 2 3 6 rows selected The above query works fine prior to
ID ----------- 3 3 2 2 2 1 6 rows selected Attaching a new simpler script, createDBsimpler.txt, to create the db which will have only the tables used by the query below
SELECT CS.ID FROM --DERBY-PROPERTIES joinOrder=FIXED FILES F --DERBY-PROPERTIES constraint=FILES_REPOSITORY_PATH , FILECHANGES FC --DERBY-PROPERTIES constraint=FILECHANGES_FILE_CHANGESET , CHANGESETS CS --DERBY-PROPERTIES constraint=CHANGESETS_PRIMARY_ID WHERE CS.ID = FC.CHANGESET AND F.ID = FC.FILE AND EXISTS ( SELECT 1 FROM FILES F2 --DERBY-PROPERTIES constraint=FILES_REPOSITORY_PATH WHERE F2.ID = FC.FILE) ORDER BY CS.ID DESC; I have simplified the db creation just a little bit more removing the unnecessary columns from the tables. The new create db script is createDBsimplerVer2.txt and the query that goes along with that db is
SELECT CS.ID FROM --DERBY-PROPERTIES joinOrder=FIXED FILES F --DERBY-PROPERTIES constraint=FILES_REPOSITORY_PATH , FILECHANGES FC --DERBY-PROPERTIES constraint=FILECHANGES_FILE_CHANGESET , CHANGESETS CS --DERBY-PROPERTIES constraint=CHANGESETS_PRIMARY_ID WHERE CS.ID = FC.CHANGESET AND F.ID = FC.FILE AND EXISTS ( SELECT 1 FROM FILES F2 --DERBY-PROPERTIES constraint=FILES_REPOSITORY_PATH WHERE F2.ID = FC.FILE) ORDER BY CS.ID DESC; The query plan for the query below is as follows
SELECT CS.ID FROM --DERBY-PROPERTIES joinOrder=FIXED FILES F --DERBY-PROPERTIES constraint=FILES_REPOSITORY_PATH , FILECHANGES FC --DERBY-PROPERTIES constraint=FILECHANGES_FILE_CHANGESET , CHANGESETS CS --DERBY-PROPERTIES constraint=CHANGESETS_PRIMARY_ID WHERE CS.ID = FC.CHANGESET AND F.ID = FC.FILE AND EXISTS ( SELECT 1 FROM FILES F2 --DERBY-PROPERTIES constraint=FILES_REPOSITORY_PATH WHERE F2.ID = FC.FILE) ORDER BY CS.ID DESC; Statement Name: null Statement Text: SELECT CS.ID FROM --DERBY-PROPERTIES joinOrder=FIXED FILES F --DERBY-PROPERTIES constraint=FILES_REPOSITORY_PATH , FILECHANGES FC --DERBY-PROPERTIES constraint=FILECHANGES_FILE_CHANGESET , CHANGESETS CS --DERBY-PROPERTIES constraint=CHANGESETS_PRIMARY_ID WHERE CS.ID = FC.CHANGESET AND F.ID = FC.FILE AND EXISTS ( SELECT 1 FROM FILES F2 --DERBY-PROPERTIES constraint=FILES_REPOSITORY_PATH WHERE F2.ID = FC.FILE) ORDER BY CS.ID DESC Parse Time: 0 Bind Time: 0 Optimize Time: 0 Generate Time: 0 Compile Time: 0 Execute Time: 0 Begin Compilation Timestamp : null End Compilation Timestamp : null Begin Execution Timestamp : null End Execution Timestamp : null Statement Execution Plan Text: Project-Restrict ResultSet (11): Number of opens = 1 Rows seen = 6 Rows filtered = 0 restriction = false projection = true constructor time (milliseconds) = 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 restriction time (milliseconds) = 0 projection time (milliseconds) = 0 optimizer estimated row count: 2.67 optimizer estimated cost: 2330.17 Source result set: User supplied optimizer overrides for join are { joinOrder=FIXED } Nested Loop Join ResultSet: Number of opens = 1 Rows seen from the left = 6 Rows seen from the right = 6 Rows filtered = 0 Rows returned = 6 constructor time (milliseconds) = 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 optimizer estimated row count: 2.67 optimizer estimated cost: 2330.17 Left result set: User supplied optimizer overrides for join are { joinOrder=FIXED} Nested Loop Exists Join ResultSet: Number of opens = 1 Rows seen from the left = 6 Rows seen from the right = 6 Rows filtered = 0 Rows returned = 6 constructor time (milliseconds) = 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 optimizer estimated row count: 6.00 optimizer estimated cost: 357.23 Left result set: User supplied optimizer overrides for join are { joinOrder=FIXED } Nested Loop Join ResultSet: Number of opens = 1 Rows seen from the left = 4 Rows seen from the right = 6 Rows filtered = 0 Rows returned = 6 constructor time (milliseconds) = 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 optimizer estimated row count: 6.00 optimizer estimated cost: 347.80 Left result set: Index Row to Base Row ResultSet for FILES: Number of opens = 1 Rows seen = 4 Columns accessed from heap = {0} constructor time (milliseconds) = 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 optimizer estimated row count: 9.00 optimizer estimated cost: 328.82 User supplied optimizer overrides on FILES are { index=SQL090803230333481 } Index Scan ResultSet for FILES using constraint FILES_REPOSITORY_PATH at read committed isolation level using share row locking chosen by the optimizer Number of opens = 1 Rows seen = 4 Rows filtered = 0 Fetch Size = 1 constructor time (milliseconds)= 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 next time in milliseconds/row =0 scan information: Bit set of columns fetched={2} Number of columns fetched=1 Number of deleted rows visited=0 Number of pages visited=1 Number of rows qualified=4 Number of rows visited=4 Scan type=btree Tree height=-1 start position: None stop position: None qualifiers: None optimizer estimated row count: 9.00 optimizer estimated cost: 328.82 Right result set: User supplied optimizer overrides on FILECHANGES are { index=SQL090803230334101 } Index Scan ResultSet for FILECHANGES using constraint FILECHANGES_FILE_CHANGESET at read committed isolation level using instantaneous share row locking chosen by the optimizer Number of opens = 4 Rows seen = 6 Rows filtered = 0 Fetch Size = 16 constructor time (milliseconds) = 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 next time in milliseconds/row = 0 scan information: Bit set of columns fetched={0, 1} Number of columns fetched=2 Number of deleted rows visited=0 Number of pages visited=4 Number of rows qualified=6 Number of rows visited=9 Scan type=btree Tree height=1 start position: >= on first 1 column(s). Ordered null semantics on the following columns:0 stop position: > on first 1 column(s). Ordered null semantics on the following columns:0 qualifiers:None optimizer estimated row count: 6.00 optimizer estimated cost: 18.97 Right result set: User supplied optimizer overrides on CHANGESETS are { index=SQL090803230333820 } Index Scan ResultSet for CHANGESETS using constraint CHANGESETS_PRIMARY_ID at read committed isolation level using share row locking chosen by the optimizer Number of opens = 6 Rows seen = 6 Rows filtered = 0 Fetch Size = 1 constructor time (milliseconds) = 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 next time in milliseconds/row = 0 scan information: Bit set of columns fetched={0} Number of columns fetched=1 Number of deleted rows visited=0 Number of pages visited=6 Number of rows qualified=6 Number of rows visited=6 Scan type=btree Tree height=1 start position: >= on first 1 column(s). Ordered null semantics on the following columns:0 stop position: > on first 1 column(s). Ordered null semantics on the following columns:0 qualifiers:None optimizer estimated row count: 6.00 optimizer estimated cost: 9.43 Right result set: Project-Restrict ResultSet (10): Number of opens = 6 Rows seen = 24 Rows filtered = 18 restriction = true projection = true constructor time (milliseconds) = 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 restriction time (milliseconds) = 0 projection time (milliseconds) = 0 optimizer estimated row count: 2.67 optimizer estimated cost: 1972.94 Source result set: Index Row to Base Row ResultSet for FILES: Number of opens = 6 Rows seen = 24 Columns accessed from heap = {0} constructor time (milliseconds) = 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 optimizer estimated row count: 2.67 optimizer estimated cost: 1972.94 User supplied optimizer overrides on FILES are { index=SQL090803230333481 } Index Scan ResultSet for FILES using constraint FILES_REPOSITORY_PATH at read committed isolation level using share row locking chosen by the optimizer Number of opens = 6 Rows seen = 24 Rows filtered = 0 Fetch Size = 1 constructor time (milliseconds) = 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 next time in milliseconds/row = 0 scan information: Bit set of columns fetched={2} Number of columns fetched=1 Number of deleted rows visited=0 Number of pages visited=6 Number of rows qualified=24 Number of rows visited=24 Scan type=btree Tree height=1 start position: None stop position: None qualifiers:None optimizer estimated row count: 2.67 optimizer estimated cost: 1972.94 | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||