Issue Details (XML | Word | Printable)

Key: DERBY-3023
Type: Bug Bug
Status: Resolved Resolved
Resolution: Fixed
Priority: Major Major
Assignee: A B
Reporter: Stefan Cordes
Votes: 0
Watchers: 0
Operations

If you were logged in you would be able to see more operations.
Derby

Different result rows depending on the sequence of INNER JOIN and OUTER JOIN

Created: 23/Aug/07 08:32 AM   Updated: 30/Jun/09 04:12 PM
Return to search
Component/s: SQL
Affects Version/s: 10.1.2.1, 10.2.2.0, 10.3.1.4
Fix Version/s: 10.1.3.2, 10.2.2.1, 10.3.3.0, 10.4.1.3

Time Tracking:
Not Specified

File Attachments:
  Size
Text File Licensed for inclusion in ASF works d3023_notTested_v1.patch 2007-11-18 01:31 AM A B 10 kB
File Licensed for inclusion in ASF works d3023_repro.sql 2007-11-18 01:31 AM A B 1.0 kB
Text File Licensed for inclusion in ASF works d3023_v2.patch 2007-11-19 07:49 PM A B 10 kB
Zip Archive derby-02-search-joins.zip 2007-08-23 08:33 AM Stefan Cordes 528 kB
Zip Archive derby-02-search-joins2.zip 2007-08-29 12:16 PM Stefan Cordes 3.67 MB
Java Source File Licensed for inclusion in ASF works DerbySearchJoins.java 2007-12-11 02:00 PM Stefan Cordes 36 kB
HTML File Licensed for inclusion in ASF works releaseNote.html 2008-04-02 01:20 PM Rick Hillegas 4 kB
HTML File Licensed for inclusion in ASF works releaseNote.html 2008-04-02 03:31 AM A B 5 kB
Zip Archive Licensed for inclusion in ASF works RUNTIMESTATISTICS-10.3.zip 2007-08-23 08:32 AM Stefan Cordes 19 kB
Zip Archive Licensed for inclusion in ASF works Statement10.3.1.4 - (561794)-j1.4.2_10.zip 2007-08-29 12:18 PM Stefan Cordes 530 kB
Environment: Windows XP, Java 1.4.2
Issue Links:
Incorporates
 
Reference
 

Issue & fix info: Release Note Needed
Resolution Date: 16/Jan/08 04:52 PM


 Description  « Hide
We have a complex SQL joining 11 Tables via INNER JOIN and OUTER JOIN.
These SQLs were tested against an z/OS DB2 Version 8.
After moving to our local platform with Derby we found out the resultsets returned by the SQLs were too less.

I tested our "old style" SQL which results in 889 rows.
Our new style SQL expected to give similar rows but gives *0*.

After some work we found a workaround: first place all the "INNER JOIN"s in the SQL and then the "OUTER JOIN"s.

{code:title=Result of testprogram}
Derby=10.3-b561794
Test 10.3-b561794-old-style-sql
889 Rows in 1703ms
Test 10.3-b561794-new-style-sql
0 Rows in 563ms _(expected 924 rows instead)_
Test 10.3-b561794-new-style-sql-only-inner
2 Rows in 766ms _(only inner joins, no outer joins but larger result)_
Test 10.3-b561794-new-style-sql_first-inner-joins
924 Rows in 578ms
Test 10.3-b561794-new-style-sql_without-condition
924 Rows in 438ms
{code}

Here our initial used SQL:
{code:title=SQL giving wrong result (0 rows)}
SELECT O4Work.ESVN01.NU_BUY_CPY AS PO_BuyCompanyNo, O4Work.ESVN01.NU_ODR AS PO_Number, O4Work.ESVN01.FL_ODR_CAE AS PO_Type, O4Work.ESVN01.NU_MCS_SPY AS PO_SupplierNo, O4Work.ESVN01.NU_ST3 AS PO_StatusNo, O4Work.ESVN01.DA_SPY_COY_PRT AS PO_SCPrintDate, O4Work.ESVN01.FL_SAS AS PO_SeasFlag, CASE WHEN (SELECT COUNT(O4Work.ESVNA5.ID_PTE) FROM O4Work.ESVNA5 WHERE O4Work.ESVN02.NU_BUY_CPY = O4Work.ESVNA5.NU_BUY_CPY AND O4Work.ESVN02.NU_ODR = O4Work.ESVNA5.NU_ODR AND O4Work.ESVN02.NU_PST = O4Work.ESVNA5.NU_PST) = 0 THEN 'N' ELSE 'Y' END AS POPA_PictureID, CASE WHEN (SELECT COUNT(O4Work.ESVNG3.NU_ODR) FROM O4Work.ESVNG3 WHERE O4Work.ESVN01.NU_BUY_CPY = O4Work.ESVNG3.NU_BUY_CPY AND O4Work.ESVN01.NU_ODR = O4Work.ESVNG3.NU_ODR) = 0 THEN 'N' ELSE 'Y' END AS ON_ID, O4Work.ESVN02.NU_PST AS POP_Position_Id, O4Work.ESVN02.NU_CTT AS POP_ContractNo, O4Work.ESVN02.NU_ARO_CTT AS POP_ArosContractNo, O4Work.ESVN02.NU_ST3 AS POP_StatusNo, O4Work.ESVN02.DA_CAE AS POP_CreationDate, O4Work.ESVN02.DA_LAT_AMD AS POP_LastAmendDate, O4Work.ESVNA0.NU_SSN_IDE AS POPD_SeasonInd, O4Work.ESVNA0.NU_STL_ID1 AS POPD_StyleId, O4Work.ESVNA0.NU_SRY_ID1 AS POPD_StoryID, O4Work.ESVNA0.NU_LC1 AS POPD_LicenseID, O4Work.ESVP00.NU_CSY AS SER_ClassNo, O4Work.ESVP00.NU_COE AS SER_CodeNo, O4Work.ESVP00.NU_SRL AS SER_SerialNo, O4Work.ESVP00.NU_PIK_MOD AS SER_PickingM, O4Work.ESVN03.NU_MT1_CPY AS POPC_MasterCpyNo, O4Work.ESVN03.QU_ODR AS POPC_OrderedQty, O4Work.ESVN03.DA_EDD AS POPC_Edd, O4Work.ESVN03.DA_LDD AS POPC_Ldd, O4Work.ESVN03.DA_PAD AS POPC_Pad, O4Work.ESVN03.DA_SAD AS POPC_Sad, O4Work.ESVN03.PR_SCP AS POPC_SupCstPrice, O4Work.ESVN03.NU_SCP_CR1 AS POPC_SupCstPrCurr, O4Work.ESVN03.NU_ST3 AS POPC_StatusNo, O4Work.ESVN03.NU_COY_FRM_ODR AS POPC_Src_PO_Number, O4Work.ESVN03.NU_COY_UTL_ODR AS POPC_Tgt_PO_Number, O4Work.ESVN03.DA_FLR_RDY AS POPC_FRM_DATE, O4Work.ESVN03.FL_CSG AS POPC_CS_FLAG, O4Work.ESVN03.NU_PAK_MOD_SPY AS POPC_PackSupplNo, O4Work.ESVN03.NU_PAK_MOD_DCR AS POPC_PackingDCNo, O4Work.ESVN03.NU_PS2_MOD AS POPC_PresMethodNo, O4Work.ESVN04.NU_RTL_CPY AS POPRC_RetailCode, O4Work.ESVN04.PR_PLN_SEL AS POPRC_SellPrice, O4Work.ESVN04.NU_PLN_SEL_PRC_CR1 AS POPRC_SellPrCurr, O4Work.ESVN08.NU_AVE AS POPRCA_AdvertNo, O4Work.ESVQ00.ID_SHP AS SHP_ShippingID, O4Work.ESVQ00.NU_SHP AS SHP_ShippingNo, O4Work.ESVNB0.NU_NTL_PDE_ID1 AS POPDC_NationalID, O4Work.ESVNB0.NU_EQP AS POPDC_EquipNumber, O4Work.ESVNE1.PE_OMU AS POPCC_OMU, CASE WHEN (SELECT COUNT(O4Work.ESVN07.NU_ODR) FROM O4Work.ESVN07 WHERE O4Work.ESVN03.NU_BUY_CPY = O4Work.ESVN07.NU_BUY_CPY AND O4Work.ESVN03.NU_MT1_CPY = O4Work.ESVN07.NU_MT1_CPY AND O4Work.ESVN03.NU_ODR = O4Work.ESVN07.NU_ODR AND O4Work.ESVN03.NU_PST = O4Work.ESVN07.NU_PST AND O4Work.ESVN07.FL_ALE_RMK = 'Y') = 0 THEN 'N' ELSE 'Y' END AS POPCU_AllocRem FROM O4Work.ESVN02
INNER JOIN O4Work.ESVN01 ON O4Work.ESVN02.NU_BUY_CPY = O4Work.ESVN01.NU_BUY_CPY AND O4Work.ESVN02.NU_ODR = O4Work.ESVN01.NU_ODR
INNER JOIN O4Work.ESVNA0 ON O4Work.ESVN02.NU_BUY_CPY = O4Work.ESVNA0.NU_BUY_CPY AND O4Work.ESVN02.NU_ODR = O4Work.ESVNA0.NU_ODR AND O4Work.ESVN02.NU_PST = O4Work.ESVNA0.NU_PST
INNER JOIN O4Work.ESVP00 ON O4Work.ESVNA0.ID_SRL = O4Work.ESVP00.ID_SRL
LEFT OUTER JOIN O4Work.ESVNA4 ON O4Work.ESVNA0.NU_BUY_CPY = O4Work.ESVNA4.NU_BUY_CPY AND O4Work.ESVNA0.NU_ODR = O4Work.ESVNA4.NU_ODR AND O4Work.ESVNA0.NU_PST = O4Work.ESVNA4.NU_PST
INNER JOIN O4Work.ESVN03 ON O4Work.ESVN02.NU_BUY_CPY = O4Work.ESVN03.NU_BUY_CPY AND O4Work.ESVN02.NU_ODR = O4Work.ESVN03.NU_ODR AND O4Work.ESVN02.NU_PST = O4Work.ESVN03.NU_PST
LEFT OUTER JOIN O4Work.ESVN04 ON O4Work.ESVN03.NU_BUY_CPY = O4Work.ESVN04.NU_BUY_CPY AND O4Work.ESVN03.NU_MT1_CPY = O4Work.ESVN04.NU_MT1_CPY AND O4Work.ESVN03.NU_ODR = O4Work.ESVN04.NU_ODR AND O4Work.ESVN03.NU_PST = O4Work.ESVN04.NU_PST
LEFT OUTER JOIN O4Work.ESVN08 ON O4Work.ESVN04.NU_BUY_CPY = O4Work.ESVN08.NU_BUY_CPY AND O4Work.ESVN04.NU_RTL_CPY = O4Work.ESVN08.NU_RTL_CPY AND O4Work.ESVN04.NU_MT1_CPY = O4Work.ESVN08.NU_MT1_CPY AND O4Work.ESVN04.NU_ODR = O4Work.ESVN08.NU_ODR AND O4Work.ESVN04.NU_PST = O4Work.ESVN08.NU_PST
INNER JOIN O4Work.ESVQ00 ON O4Work.ESVN03.ID_SHP = O4Work.ESVQ00.ID_SHP
INNER JOIN O4Work.ESVNB0 ON O4Work.ESVN03.NU_BUY_CPY = O4Work.ESVNB0.NU_BUY_CPY AND O4Work.ESVN03.NU_MT1_CPY = O4Work.ESVNB0.NU_MT1_CPY AND O4Work.ESVN03.NU_ODR = O4Work.ESVNB0.NU_ODR AND O4Work.ESVN03.NU_PST = O4Work.ESVNB0.NU_PST
INNER JOIN O4Work.ESVNE1 ON O4Work.ESVN03.NU_BUY_CPY = O4Work.ESVNE1.NU_BUY_CPY AND O4Work.ESVN03.NU_MT1_CPY = O4Work.ESVNE1.NU_MT1_CPY AND O4Work.ESVN03.NU_ODR = O4Work.ESVNE1.NU_ODR AND O4Work.ESVN03.NU_PST = O4Work.ESVNE1.NU_PST
WHERE O4Work.ESVN01.NU_BUY_CPY = 99 AND O4Work.ESVNA0.NU_CPY_GRP = 0
{code}
and the one with moved inner joins:
{code:title=SQL giving correct result (924 rows)}
SELECT O4Work.ESVN01.NU_BUY_CPY AS PO_BuyCompanyNo, O4Work.ESVN01.NU_ODR AS PO_Number, O4Work.ESVN01.FL_ODR_CAE AS PO_Type, O4Work.ESVN01.NU_MCS_SPY AS PO_SupplierNo, O4Work.ESVN01.NU_ST3 AS PO_StatusNo, O4Work.ESVN01.DA_SPY_COY_PRT AS PO_SCPrintDate, O4Work.ESVN01.FL_SAS AS PO_SeasFlag, CASE WHEN (SELECT COUNT(O4Work.ESVNA5.ID_PTE) FROM O4Work.ESVNA5 WHERE O4Work.ESVN02.NU_BUY_CPY = O4Work.ESVNA5.NU_BUY_CPY AND O4Work.ESVN02.NU_ODR = O4Work.ESVNA5.NU_ODR AND O4Work.ESVN02.NU_PST = O4Work.ESVNA5.NU_PST) = 0 THEN 'N' ELSE 'Y' END AS POPA_PictureID, CASE WHEN (SELECT COUNT(O4Work.ESVNG3.NU_ODR) FROM O4Work.ESVNG3 WHERE O4Work.ESVN01.NU_BUY_CPY = O4Work.ESVNG3.NU_BUY_CPY AND O4Work.ESVN01.NU_ODR = O4Work.ESVNG3.NU_ODR) = 0 THEN 'N' ELSE 'Y' END AS ON_ID, O4Work.ESVN02.NU_PST AS POP_Position_Id, O4Work.ESVN02.NU_CTT AS POP_ContractNo, O4Work.ESVN02.NU_ARO_CTT AS POP_ArosContractNo, O4Work.ESVN02.NU_ST3 AS POP_StatusNo, O4Work.ESVN02.DA_CAE AS POP_CreationDate, O4Work.ESVN02.DA_LAT_AMD AS POP_LastAmendDate, O4Work.ESVNA0.NU_SSN_IDE AS POPD_SeasonInd, O4Work.ESVNA0.NU_STL_ID1 AS POPD_StyleId, O4Work.ESVNA0.NU_SRY_ID1 AS POPD_StoryID, O4Work.ESVNA0.NU_LC1 AS POPD_LicenseID, O4Work.ESVP00.NU_CSY AS SER_ClassNo, O4Work.ESVP00.NU_COE AS SER_CodeNo, O4Work.ESVP00.NU_SRL AS SER_SerialNo, O4Work.ESVP00.NU_PIK_MOD AS SER_PickingM, O4Work.ESVN03.NU_MT1_CPY AS POPC_MasterCpyNo, O4Work.ESVN03.QU_ODR AS POPC_OrderedQty, O4Work.ESVN03.DA_EDD AS POPC_Edd, O4Work.ESVN03.DA_LDD AS POPC_Ldd, O4Work.ESVN03.DA_PAD AS POPC_Pad, O4Work.ESVN03.DA_SAD AS POPC_Sad, O4Work.ESVN03.PR_SCP AS POPC_SupCstPrice, O4Work.ESVN03.NU_SCP_CR1 AS POPC_SupCstPrCurr, O4Work.ESVN03.NU_ST3 AS POPC_StatusNo, O4Work.ESVN03.NU_COY_FRM_ODR AS POPC_Src_PO_Number, O4Work.ESVN03.NU_COY_UTL_ODR AS POPC_Tgt_PO_Number, O4Work.ESVN03.DA_FLR_RDY AS POPC_FRM_DATE, O4Work.ESVN03.FL_CSG AS POPC_CS_FLAG, O4Work.ESVN03.NU_PAK_MOD_SPY AS POPC_PackSupplNo, O4Work.ESVN03.NU_PAK_MOD_DCR AS POPC_PackingDCNo, O4Work.ESVN03.NU_PS2_MOD AS POPC_PresMethodNo, O4Work.ESVN04.NU_RTL_CPY AS POPRC_RetailCode, O4Work.ESVN04.PR_PLN_SEL AS POPRC_SellPrice, O4Work.ESVN04.NU_PLN_SEL_PRC_CR1 AS POPRC_SellPrCurr, O4Work.ESVN08.NU_AVE AS POPRCA_AdvertNo, O4Work.ESVQ00.ID_SHP AS SHP_ShippingID, O4Work.ESVQ00.NU_SHP AS SHP_ShippingNo, O4Work.ESVNB0.NU_NTL_PDE_ID1 AS POPDC_NationalID, O4Work.ESVNB0.NU_EQP AS POPDC_EquipNumber, O4Work.ESVNE1.PE_OMU AS POPCC_OMU, CASE WHEN (SELECT COUNT(O4Work.ESVN07.NU_ODR) FROM O4Work.ESVN07 WHERE O4Work.ESVN03.NU_BUY_CPY = O4Work.ESVN07.NU_BUY_CPY AND O4Work.ESVN03.NU_MT1_CPY = O4Work.ESVN07.NU_MT1_CPY AND O4Work.ESVN03.NU_ODR = O4Work.ESVN07.NU_ODR AND O4Work.ESVN03.NU_PST = O4Work.ESVN07.NU_PST AND O4Work.ESVN07.FL_ALE_RMK = 'Y') = 0 THEN 'N' ELSE 'Y' END AS POPCU_AllocRem FROM O4Work.ESVN02
INNER JOIN O4Work.ESVN01 ON O4Work.ESVN02.NU_BUY_CPY = O4Work.ESVN01.NU_BUY_CPY AND O4Work.ESVN02.NU_ODR = O4Work.ESVN01.NU_ODR
INNER JOIN O4Work.ESVNA0 ON O4Work.ESVN02.NU_BUY_CPY = O4Work.ESVNA0.NU_BUY_CPY AND O4Work.ESVN02.NU_ODR = O4Work.ESVNA0.NU_ODR AND O4Work.ESVN02.NU_PST = O4Work.ESVNA0.NU_PST
INNER JOIN O4Work.ESVP00 ON O4Work.ESVNA0.ID_SRL = O4Work.ESVP00.ID_SRL
INNER JOIN O4Work.ESVN03 ON O4Work.ESVN02.NU_BUY_CPY = O4Work.ESVN03.NU_BUY_CPY AND O4Work.ESVN02.NU_ODR = O4Work.ESVN03.NU_ODR AND O4Work.ESVN02.NU_PST = O4Work.ESVN03.NU_PST
INNER JOIN O4Work.ESVNB0 ON O4Work.ESVN03.NU_BUY_CPY = O4Work.ESVNB0.NU_BUY_CPY AND O4Work.ESVN03.NU_MT1_CPY = O4Work.ESVNB0.NU_MT1_CPY AND O4Work.ESVN03.NU_ODR = O4Work.ESVNB0.NU_ODR AND O4Work.ESVN03.NU_PST = O4Work.ESVNB0.NU_PST
INNER JOIN O4Work.ESVQ00 ON O4Work.ESVN03.ID_SHP = O4Work.ESVQ00.ID_SHP
LEFT OUTER JOIN O4Work.ESVNA4 ON O4Work.ESVNA0.NU_BUY_CPY = O4Work.ESVNA4.NU_BUY_CPY AND O4Work.ESVNA0.NU_ODR = O4Work.ESVNA4.NU_ODR AND O4Work.ESVNA0.NU_PST = O4Work.ESVNA4.NU_PST
LEFT OUTER JOIN O4Work.ESVN04 ON O4Work.ESVN03.NU_BUY_CPY = O4Work.ESVN04.NU_BUY_CPY AND O4Work.ESVN03.NU_MT1_CPY = O4Work.ESVN04.NU_MT1_CPY AND O4Work.ESVN03.NU_ODR = O4Work.ESVN04.NU_ODR AND O4Work.ESVN03.NU_PST = O4Work.ESVN04.NU_PST
LEFT OUTER JOIN O4Work.ESVN08 ON O4Work.ESVN04.NU_BUY_CPY = O4Work.ESVN08.NU_BUY_CPY AND O4Work.ESVN04.NU_RTL_CPY = O4Work.ESVN08.NU_RTL_CPY AND O4Work.ESVN04.NU_MT1_CPY = O4Work.ESVN08.NU_MT1_CPY AND O4Work.ESVN04.NU_ODR = O4Work.ESVN08.NU_ODR AND O4Work.ESVN04.NU_PST = O4Work.ESVN08.NU_PST
INNER JOIN O4Work.ESVNE1 ON O4Work.ESVN03.NU_BUY_CPY = O4Work.ESVNE1.NU_BUY_CPY AND O4Work.ESVN03.NU_MT1_CPY = O4Work.ESVNE1.NU_MT1_CPY AND O4Work.ESVN03.NU_ODR = O4Work.ESVNE1.NU_ODR AND O4Work.ESVN03.NU_PST = O4Work.ESVNE1.NU_PST
WHERE O4Work.ESVN01.NU_BUY_CPY = 99 AND O4Work.ESVNA0.NU_CPY_GRP = 0
{code}

Another curious behavior is that leaving out the condition {{O4Work.ESVNA0.NU_CPY_GRP = 0}} in the first SQL will give us the result of 924 all having O4Work.ESVNA0.NU_CPY_GRP = 0. So evaluation of the condition is buggy, too.

I think there may be a dependency between this issue and these ones:
[DERBY-1681|http://issues.apache.org/jira/browse/DERBY-1681] (Regression (wrong results): Join predicate can be ignored for left-most child in a chain of nested unions.)
[DERBY-1633|https://issues.apache.org/jira/browse/DERBY-1633] (Regression: The fields of views are not being calculated properly since 10.1.2.4)

Attached is an Eclipse project with the Test-Program (without the Derby-Libraries) and the several RUNTIMESTATISTICS.

 All   Comments   Work Log   Change History   Subversion Commits      Sort Order: Ascending order - Click to sort in descending order
A B added a comment - 23/Aug/07 03:27 PM
Am I right in understanding that this error (wrong results) is reproduceable with Derby 10.1.2.*1*?

If so, then that's useful information: DERBY-1681 and DERBY-1633 were regressions caused by changes for DERBY-805, and the first commit for DERBY-805 (there were many) went into 10.2 with svn #381859. The first DERBY-805 changes were then ported back to the 10.1 branch with svn #396211. But that's *after* the 10.1.2.1 release was cut--10.1.2.1 was released at svn # 330608, per:

  http://db.apache.org/derby/derby_downloads.html#Archived+Official+Releases

So if this bug does actually affect 10.1.2.*1*, then we can say with some degree of certainty that this particular Jira is *not* a regression caused by DERBY-805 (nor any of the DERBY-805 follow-up work). That's a good indication that the problem has been around for a while, which helps narrow down the searching field. It is quite possible that the problem resides in the code that was changed for DERBY-1681 and/or DERBY-1633, but it would be good to know for sure whether or not those Jiras are actually the cause, or just different problems in the same area of code.

I do not have time to try to run the repro against 10.1.2.1 right now, but if you could confirm that the problem does exist there, I do think that would be helpful...

Stefan Cordes added a comment - 29/Aug/07 12:16 PM
Hallo!
This bug does affect 10.1.2.1. Even 10.1.1.0 is affected.

I tested with some of the Derby releases and all have the exactly the same (wrong) resultset depending on the join order:
Statement10.1.1.0 - (208786)-j1.4.2_10
Statement10.1.2.1 - (330608)-j1.4.2_10
Statement10.1.2.2 - (349064)-j1.4.2_10
Statement10.1.2.2 - (370021)-j1.4.2_10
Statement10.2.2.0 - (485682)-j1.4.2_10
Statement10.3.1.4 - (561794)-j1.4.2_10

In addition I mixed our Joins to all possible combinations and found out e.g.
The join order
NA0-P00-NA4-N01-N03-NE1-N04-N08-NB0-Q00 gives 924 rows,
but
NA0-P00-NA4-N01-N03-NE1-N04-N08-Q00-NB0 only 0 rows.
(Only last to tables are switched)

Comparing the statistics the sucessful SQL with Q00 as right join (Rows seen = 924)

Right result set:
Hash Scan ResultSet for ESVQ00 at read committed isolation level using instantaneous share table locking:
Number of opens = 924
Hash table size = 60
Hash key is column number 0
Rows seen = 924
Rows filtered = 0
constructor time (milliseconds) = 0
open time (milliseconds) = 0
next time (milliseconds) = 0
close time (milliseconds) = 0
next time in milliseconds/row = 0

scan information:
Bit set of columns fetched={0, 6}
Number of columns fetched=2
Number of pages visited=3
Number of rows qualified=60
Number of rows visited=60

but on the other hand (Rows seen = 0) with NB0 as most right join:

Right result set:
Hash Scan ResultSet for ESVNB0 at read committed isolation level using instantaneous share table locking:
Number of opens = 924
Hash table size = 0
Hash keys are column numbers (1,2)
Rows seen = 0
Rows filtered = 0
constructor time (milliseconds) = 0
open time (milliseconds) = 0
next time (milliseconds) = 0
close time (milliseconds) = 0

scan information:
Bit set of columns fetched={0, 1, 2, 3, 6, 8}
Number of columns fetched=6
Number of pages visited=8
Number of rows qualified=0
Number of rows visited=240

Maybe that helps.

I attach the complete results from all versions I tested as derby-02-search-joins2.zip

A B added a comment - 21/Oct/07 02:42 AM - edited
I did some tracing through the execution-time result set processing for "new-style-sql.txt" and while I still do not understand what is going on, I *think* that part of the problem is that the two predicates for the outer-most query, i.e.:

  WHERE O4Work.ESVN01.NU_BUY_CPY = 99 AND O4Work.ESVNA0.NU_CPY_GRP = 0

are both being applied as "scan qualifiers" to the initial scans for ESVNB0 and ESVNE1, in addition to being applied to their respective target tables. That is to say, "NU_BUY_CPY = 99" translates into a qualifier on the "first column" in ESVN01 (because NU_BUY_CPY is the first column in that table); but further up the result set tree, that same qualifier is being applied to the "first column" of ESVNB0 and (still later) to the "first column" of ESVNE1--which, if that's what is actually happening, would be wrong...

Similarly, "NU_CPY_GRP = 0" translates into a qualifier on the "fourth column" of ESVNA0 (because NU_CPY_GRP is the fourth column in that table); but later that same qualifier is being applied to the "fourth column" of ESVNB0 and (still later) the "fourth column" of ESVNE1.

The reason the query returns the correct results (924 rows) if there is only one predicate, "ESVN01.NU_BUY_CPY = 99", is because, as luck would have it, the first column in both ESVNB0 and ESVNE1 is *also* NU_BUY_CPY, and it, like ESVN01, has the value 99 for all rows. So even though the predicate is (apparently) mis-applied, it doesn't affect the results because it doesn't actually filter any rows.

But the second predicate, "ESVNA0.NU_CPY_GRP = 0", does affect the results because the fourth column of ESVNB0 and ESVNE1 is "NU_MT1_CPY", and that column does not have any rows with value "0". So when the predicate is mis-applied to ESVNB0, it eliminates all of ESVNB0's rows, thus causing the query to return no results.

In an attempt to validate this somewhat bizarre theory, I updated a few of the rows in ESVNB0 to have the value "0" for column NU_MT1_CPY:

  update o4work.esvnb0 set nu_mt1_cpy = 0 where nu_mt1_cpy = 2;

When I re-ran "new-style-sql.txt", the query still returned 0 rows. Further tracing showed that we were now getting some rows back from ESVNB0, but then we were joining them with ESVNE1--and since ESVNE1 still didn't have any NU_MT1_CPY columns with value 0, we were filtering out all of ESVNE1's rows, leading to an empty result set. So I then updated ESVNE1 in a way similar to ESVNB0, ie:

  update o4work.esvne1 set nu_mt1_cpy = 0 where nu_mt1_cpy = 2;

When I did that, "new-style-sql.txt" returned 906 rows. This seems to suggest (though it's far from conclusive) that the NU_CPY_GRP qualifier is being mis-applied in the hash scans for ESVNB0 and ESNB1.

A look at the query plans for "new-style-sql_without-condition.txt" and "new-style-sql.txt" seems to suggest a similar thing. Note in particular the scan qualifiers for the hash scans on ESVNB0 and ESVNE1. For "new-style-sql_without-condition", we see a single scan qualifier on the first column:

 scan qualifiers:
 Column[0][0] Id: 0
 Operator: =
 Ordered nulls: false
 Unknown return value: false
 Negate comparison result: false

This corresponds to "NU_BUY_CPY = 99" (verified by tracing, where the qualifier value is '99'), which is always true for all rows of ESVNB0 and ESVNE1. But for "new-style-sql.txt", we see *two* scan qualifiers:

 scan qualifiers:
 Column[0][0] Id: 0
 Operator: =
 Ordered nulls: false
 Unknown return value: false
 Negate comparison result: false

 Column[0][1] Id: 3
 Operator: =
 Ordered nulls: false
 Unknown return value: false
 Negate comparison result: false

The second qualifier is on the fourth column and has a value of '0', which appears to be coming from the "NU_CPY_GRP = 0" predicate. That qualifier is eliminating all rows from the scans (for both ESVNB0 and ESVNE1), leading to zero results.

Given this potentially erroneus theory, it appears that somehow the qualifiers for the outer-most WHERE clause are being mapped incorrectly onto ESVNB0 and ESVNE1. Maybe this is a transitive closure computation problem, maybe it's something else entirely. At this point I don't know, I'm just dumping what I've seen thus far...

A B added a comment - 18/Nov/07 01:31 AM
I did some further tracing/investigation of this issue and confirmed that search transitive closure is indeed adding incorrect predicates to the predicate list. In the case of "new-style-sql.txt", we end up with four additional predicates:

   ESVNB0.NU_BUY_CPY = 99
   ESVNE1.NU_BUY_CPY = 99
   ESVNB0.NU_MT1_CPY = 0
   ESVNE1.NU_MT1_CPY = 0

All of these predicates are incorrectly inferred from two predicates in the outermost WHERE clause, namely

  O4Work.ESVN01.NU_BUY_CPY = 99 AND O4Work.ESVNA0.NU_CPY_GRP = 0

It turns out that the introduction of these invalid predicates stems from a problem that is almost identical to the one described in DERBY-2526. For details please see d2526_v1.html as attached to that issue, since pretty much everything in that document applies to this issue, as well.

In fact, the fix for that issue nearly solved this one, too. More specifically, the fix for DERBY-2526 added the following lines to the remapColumnReferencesToExpressions() method of ColumnReference.java:

    /* It's not enough to just set the table number. Depending
     * on the original query specified and on whether or not
     * subquery flattening has occurred, it's possible that
     * the expression to which we're remapping has a different
     * result column list than the one to which we were mapped
     * before we got here. In that case we also need to update
     * the columnNumber to point to the correct column in "ft".
     * That's what the following line does. See DERBY-2526 for
     * details.
     */
    ResultColumn ftRC =
        ft.getResultColumns().getResultColumn(columnName);

    columnNumber = ftRC.getColumnPosition();

That code depends on the "getColumnPosition()" method of ResultColumn, which looks as follows:

    public int getColumnPosition()
    {
        if (columnDescriptor!=null)
            return columnDescriptor.getPosition();
        else
            return virtualColumnId;
    }

The document d2526_v1.html mentioned two things regarding this code:

  1) if "columnDescriptor" is null, the method will return virtualColumnId;
  2) columnDescriptor will *not* be null if the ResultColumn (ftRC in this case)
     points directly to a base table. In that case the method will return the
     position as stored in the column descriptor.

Both of those statements are still true. But there's another fact about this code that DERBY-2526 did not catch:

  3) columnDescriptor may be NON-null *even if* ftRC points to a virtual
     column.

In the case of "new-style-sql" we'll end up calling the above code for both sides of any join predicates that belong to JoinNodes which are flattened (again, see the document attached to DERBY-2526 for details). In the case where one of the flattened JoinNode's children is a left outer join (i.e. HalfOuterJoinNode) ftRC points to the HalfOuterJoinNode. If that HalfOuterJoinNode then has base tables as its children, two things will be true: a) ftRC will point to a VIRTUAL column--namely, to one of the columns in HalfOuterJoinNode's result column, and b) ftRC will have a columnDescriptor that describes the column in the underlying base table to which ftRC ultimately points. So now we have a ResultColumn ftRC which has a non-null columnDescriptor AND which points to a virtual column.

In the above code, then, when we call:

    columnNumber = ftRC.getColumnPosition();

we'll see that columnDescriptor is non-null and thus columnNumber will become the column position w.r.t. the *base* table. But that's not quite correct: ftRC points to a virtual result column list (i.e. the RCL for HalfOuterJoinNode), so columnNumber has to reflect the correct position with respect to that virtual column list. In other words, we need to retrieve the *virtualColumnId*, not the base column position.

When we mis-assign the columnNumber for the join predicates, we end up confusing the transitive closure code, which leads to the eventual addition of incorrect predicates to the WHERE list, and therefore we end up with missing rows.

I made a slight tweak to the code added by DERBY-2526 to make sure that columnNumber is always set to the virtualColumnId when the ResultColumn in question points to a virtual column:

- columnNumber = ftRC.getColumnPosition();
+ /* Use the virtual column id if the ResultColumn's expression
+ * is a virtual column (DERBY-3023).
+ */
+ columnNumber =
+ (ftRC.getExpression() instanceof VirtualColumnNode)
+ ? ftRC.getVirtualColumnId()
+ : ftRC.getColumnPosition();

With this one-line change, "new-style-sql" returns the same results as "new-style-sql_without-condition". So that seems like a step in the right direction for this issue. Also with that change I ran the repro program, DerbySearchJoins.java, and all but two of the queries now return 924 rows--another good sign.

The two queries that still do not return 924 rows are:

  old-style-sql.txt (returns 889 rows)
  new-style-sql-only-inner.txt (returns 2 rows)

If it's true that those queries are supposed to return 924 rows, as well (Stefan, can you confirm that? Is that what DB2 shows, as well?), then further investigation is still needed before this issue can be resolved. But at least we're a step closer...

I'm attaching a simplified repro for the problem that I just described, along with a patch that includes the one-line change and a corresponding test case for the nightlies. This patch is NOT for commit, though, as I have not yet run derbyall. I did run suites.All and saw 92 failures along with 24 errors, all of which were in upgrade tests. When I re-ran the suite without my changes to ColumnReference the same tests failed, so I'm hoping it's just an environment issue.

A B added a comment - 19/Nov/07 07:49 PM
I ran derbyall on Red Hat Linux with ibm15 and there were no failures.

I realized that the reason the UpgradeTests were failing over the weekend was that I was disconnected and didn't have old release jars on my machine. When I re-ran suites.All (via "ant junit-all") today all tests passed.

So I'm re-attaching the changes as "d3023_v2.patch", which is identical to "d3023_notTested_v1.patch"--it just has a different name because the patch has now been tested.

Note that committing the patch for this issue will lead to an ASSERTion failure in the repro query "new-style-sql" due to DERBY-3214. That's not a regression, it's just a case where fixing one bug exposes another bug--and in this case, I think if one is running in insane mode DERBY-3214 will not actually reveal itself; the optimizer will just pick a potentially suboptimal query plan.

I'm *not* checking the "Patch Available" flag as I will be heading of town this evening and will be unable to handle follow-up comments until I return. But I thought I'd post the patch and the test results now in case someone else wants to drive this to commit in the interim.

Also note: still waiting to here back from Stefan on whether "old-style-sql" and "new-style-sql_only-inner" are supposed to return 924 rows, as well. If so, then there's more work to be done here (d3023_v2.patch would be an incremental step in that case).

Stefan Cordes added a comment - 20/Nov/07 12:35 PM
No, old-style-sql and new-style-sql_only-inner are *not* expected to have 924 rows.

(In the old-style-sql is a "LEFT OUTER JOIN O4Work.ESVN07" increases duplicate rows and NU_COT_TYP = 1 decreases them.
After removing NU_COT_TYP = 1 and adding a distinct I get 924 rows for old-style-sql )
(In the new-style-sql_only-inner only inner joins forces the less rows).

I adapted the program to give more hint about what expected:

Test old-style-sql
ok: 924 rows
Test old-style-sql-without-condition
ok: 924 rows
Test new-style-sql
ERROR: Expected 924 rows but were 0 rows
Test new-style-sql-only-inner
ok: 2 rows
Test new-style-sql_first-inner-joins
ok: 924 rows
Test new-style-sql_without-condition
ok: 924 rows


A B added a comment - 03/Dec/07 07:12 PM
> I adapted the program to give more hint about what expected:

Thanks Stefan. Can you attach the revised program to this issue? Just the ".java" file should be good enough...

Stefan Cordes added a comment - 11/Dec/07 02:02 PM
Here it is: DerbySearchJoins.java

A B added a comment - 16/Dec/07 03:26 AM
Marking "Patch Available" as I am now available to respond to review comments/suggestions on the description of the problem and/or on the suggested d3023_v2.patch. From what I can tell, that patch resolves the issue and causes the various queries posted by Stefan to return the correct number of rows. Stefan, please correct me if I'm wrong...

Stefan Cordes added a comment - 14/Jan/08 02:01 PM
I did not test the patch but if new-style-sql returns 924 rows it's solved for me.

A B added a comment - 16/Jan/08 04:52 PM
Having heard no objections to the patch, and based on Stefan's feedback, I committed d3023_v2.patch with svn # 612504:

  URL: http://svn.apache.org/viewvc?rev=612504&view=rev

Marking the issue as resolved for 10.4. If the tinderbox tests run cleanly over the next couple of days, I will look at porting this change back to 10.3.

A B added a comment - 21/Jan/08 11:23 PM
Tinderbox runs on trunk ran for several days with no apparent fallout from the changes for this issue, so I ported back to 10.3 with a simple merge command:

  svn merge -r 612503:612504 https://svn.apache.org/repos/asf/db/derby/code/trunk

I ran derbyall and suites.All with ibm142 and there were no new failures. So I committed with svn # 614046:

  URL: http://svn.apache.org/viewvc?rev=614046&view=rev

Updating fix-in to reflect the fact that this is now in 10.3.

Dyre Tjeldvoll added a comment - 18/Mar/08 10:13 PM
This issue has fix version 10.4 and is marked with either 'Release note needed' or 'Existing application impact', but does not have a releaseNote.html attached to it. Should it?

A B added a comment - 02/Apr/08 03:31 AM
Attaching a release note for this issue. I had to use MS Word to create the file and as a result it doesn't pass the "ReleaseNoteReader" tool check--i.e. it's going to have to be "scrubbed" in order for the release note generator to work correctly. I apologize in advance, but I'm hoping someone else can do the scrubbing as it was not as straightforward as I had hoped it would be, and I don't really have time to figure it all out...

Rick Hillegas added a comment - 02/Apr/08 01:20 PM
Thanks for the release note, Army. I have attached a scrubbed version which passes the checks performed by the ReleaseNoteReader.

A B added a comment - 02/Apr/08 04:32 PM
> I have attached a scrubbed version which passes the checks performed by the ReleaseNoteReader.

Thanks Rick!