Derby / DERBY-47

Some possible improvements to IN optimization

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 10.0.2.0
    • Fix Version/s: 10.3.1.4
    • Component/s: SQL
    • Labels:
      None
    • Environment:
      all

      Description

      Consider a simple case of -
      A table tbl has 10000 rows, there is a primary key index on i1
      and the query in question is
      select * from tbl where i1 in (-1,100000)

      Derby does a table scan of the entire table, even though the "IN" list has only two values and the comparison is on a field that has an index.

      Briefly looking at the code, it seems we insert a BETWEEN and use the IN list to derive the start and stop values for the scan. The range spanned by the values in the "IN" list therefore plays an important role.

      Thus if the query was changed to select * from tbl where i1 in (-1, 1), an index scan would be chosen.
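      The effect of that derivation can be sketched in a few lines of Python (a hypothetical model, not Derby's actual code): the scan bounds come from the extremes of the IN list, so a single outlier widens the scanned range to cover the whole table.

```python
def derived_scan_range(in_list):
    """Model of the IN -> BETWEEN rewrite described above: the scan
    is bounded by the smallest and largest values in the IN list."""
    return min(in_list), max(in_list)

# Tight list -> narrow range; an index scan is attractive.
assert derived_scan_range([-1, 1]) == (-1, 1)

# One outlier -> the derived range covers the entire 0..9999 table,
# so a range scan degenerates into a full table scan.
assert derived_scan_range([-1, 100000]) == (-1, 100000)
```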

      It would be nice if we could do something clever in this case where there is clearly an index on the field and the number of values in the IN list is known. Maybe use the rowcount estimate and the IN list size to do some optimizations.

      • Consider the length of the "IN" list when planning the scan, i.e. use the IN-list values to do index key searches on the table;
      • or try to convert it to a join: use the "IN"-list values to create a temporary table and join against it. The optimizer would then most likely choose the IN-list table as the outer table of the join, and thus do key searches on the larger table.
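      The join rewrite suggested above can be sketched with Python's stdlib sqlite3 module (SQLite stands in for Derby here, since Derby itself is not at hand; the table mirrors the example in this report, and the result sets are asserted equal):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (i1 INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO tbl VALUES (?)",
                 [(i,) for i in range(10000)])  # rows 0..9999

# Original form: the derived range [-1, 100000] spans the whole table,
# so a range-based plan degenerates into a full scan.
in_rows = conn.execute(
    "SELECT * FROM tbl WHERE i1 IN (-1, 100000)").fetchall()

# Rewrite: materialize the IN list as a temporary table and join.
# Each value becomes one key lookup on tbl's primary-key index.
conn.execute("CREATE TEMP TABLE in_list (v INTEGER)")
conn.executemany("INSERT INTO in_list VALUES (?)", [(-1,), (100000,)])
join_rows = conn.execute(
    "SELECT tbl.* FROM in_list JOIN tbl ON tbl.i1 = in_list.v").fetchall()

assert sorted(in_rows) == sorted(join_rows)  # same result set either way
```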

      -------------------------------------------------------------------
      Some query plans that I logged using derby.language.logQueryPlan=true for similar queries:

      The table has ascending values from 0 to 9999 for i1, with a primary key index on i1.

      GMT Thread[UT0,5,main] (XID = 19941), (SESSIONID = 0), select * from scanfixed where i1 in (-1,9999,9998,9997,9996,9995,9994,9993,9992,9991,9990) ******* Project-Restrict ResultSet (2):
      Number of opens = 1
      Rows seen = 10000
      Rows filtered = 9990
      restriction = true
      projection = false
      constructor time (milliseconds) = 0
      open time (milliseconds) = 0
      next time (milliseconds) = 0
      close time (milliseconds) = 0
      restriction time (milliseconds) = 0
      projection time (milliseconds) = 0
      optimizer estimated row count: 750.38
      optimizer estimated cost: 8579.46

      Source result set:
      Table Scan ResultSet for SCANFIXED at read committed isolation level using instantaneous share row locking chosen by the optimizer
      Number of opens = 1
      Rows seen = 10000
      Rows filtered = 0
      Fetch Size = 16
      constructor time (milliseconds) = 0
      open time (milliseconds) = 0
      next time (milliseconds) = 0
      close time (milliseconds) = 0
      next time in milliseconds/row = 0

      scan information:
      Bit set of columns fetched=All
      Number of columns fetched=9
      Number of pages visited=417
      Number of rows qualified=10000
      Number of rows visited=10000
      Scan type=heap
      start position:
      null
      stop position:
      null
      qualifiers:
      Column[0][0] Id: 0
      Operator: <=
      Ordered nulls: false
      Unknown return value: false
      Negate comparison result: false
      Column[0][1] Id: 0
      Operator: <
      Ordered nulls: false
      Unknown return value: true
      Negate comparison result: true

      optimizer estimated row count: 750.38
      optimizer estimated cost: 8579.46

      ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
      2004-10-14 18:59:47.577 GMT Thread[UT0,5,main] (XID = 19216), (SESSIONID = 0), select * from scanfixed where i1 in (9999,9998,9997,9996,9995,9994,9993,9992,9991,9990) ******* Project-Restrict ResultSet (3):
      Number of opens = 1
      Rows seen = 10
      Rows filtered = 0
      restriction = true
      projection = true
      constructor time (milliseconds) = 0
      open time (milliseconds) = 0
      next time (milliseconds) = 0
      close time (milliseconds) = 0
      restriction time (milliseconds) = 0
      projection time (milliseconds) = 0
      optimizer estimated row count: 4.80
      optimizer estimated cost: 39.53

      Source result set:
      Index Row to Base Row ResultSet for SCANFIXED:
      Number of opens = 1
      Rows seen = 10
      Columns accessed from heap =

      {0, 1, 2, 3, 4, 5, 6, 7, 8}

      constructor time (milliseconds) = 0
      open time (milliseconds) = 0
      next time (milliseconds) = 0
      close time (milliseconds) = 0
      optimizer estimated row count: 4.80
      optimizer estimated cost: 39.53

      Index Scan ResultSet for SCANFIXED using index SCANFIXEDX at read committed isolation level using instantaneous share row locking chosen by the optimizer
      Number of opens = 1
      Rows seen = 10
      Rows filtered = 0
      Fetch Size = 16
      constructor time (milliseconds) = 0
      open time (milliseconds) = 0
      next time (milliseconds) = 0
      close time (milliseconds) = 0
      next time in milliseconds/row = 0

      scan information:
      Bit set of columns fetched=All
      Number of columns fetched=2
      Number of deleted rows visited=0
      Number of pages visited=2
      Number of rows qualified=10
      Number of rows visited=10
      Scan type=btree
      Tree height=2
      start position:
      >= on first 1 column(s).
      Ordered null semantics on the following columns:

      stop position:
      > on first 1 column(s).
      Ordered null semantics on the following columns:

      qualifiers:
      None
      optimizer estimated row count: 4.80
      optimizer estimated cost: 39.53

      1. d47_beforeAndAfter.html
        112 kB
        A B
      2. d47_beforeAndAfter.html
        112 kB
        A B
      3. d47_mp_junitTest_v1.patch
        32 kB
        A B
      4. d47_mp_cleanup_v1.stat
        0.6 kB
        A B
      5. d47_mp_cleanup_v1.patch
        17 kB
        A B
      6. d47_mp_preprocess_v2.patch
        20 kB
        A B
      7. Derby47PerformanceTest.java
        38 kB
        James Synge
      8. readlocks_withContext.diff
        34 kB
        A B
      9. d47_mp_preprocess_v1.stat
        0.8 kB
        A B
      10. d47_mp_masters_v1.patch
        59 kB
        A B
      11. d47_mp_preprocess_v1.patch
        18 kB
        A B
      12. d47_mp_exec_v1.stat
        0.4 kB
        A B
      13. d47_mp_exec_v1.patch
        23 kB
        A B
      14. d47_mp_codeGen_v1.stat
        0.9 kB
        A B
      15. d47_mp_codeGen_v1.patch
        28 kB
        A B
      16. d47_mp_addlTestCases.patch
        37 kB
        A B
      17. d47_mp_CBO_MoAP_v1.stat
        0.2 kB
        A B
      18. d47_mp_CBO_MoAP_v1.patch
        10 kB
        A B
      19. d47_mp_relOpPredCheck_v1.stat
        0.2 kB
        A B
      20. d47_mp_relOpPredCheck_v1.patch
        9 kB
        A B
      21. readlocks_withContext.diff
        43 kB
        A B
      22. readlocks.diff
        15 kB
        A B
      23. d47_engine_doNotCommit_v1.stat
        2 kB
        A B
      24. d47_engine_doNotCommit_v1.patch
        68 kB
        A B
      25. InListOperatorNode.java
        24 kB
        James Synge
      26. Derby47PerformanceTest.java
        38 kB
        James Synge
      27. derby-47-performance-data.txt
        4 kB
        James Synge
      28. derby-47-performance-data.txt
        4 kB
        James Synge
      29. Derby47PerformanceTest.java
        32 kB
        James Synge
      30. QueryPlanUniqueIndexOnlyTwoTerms.txt
        3 kB
        Kevin Hore
      31. QueryPlanUniqueIndexOnlyOneTerm.txt
        3 kB
        Kevin Hore
      32. QueryPlanUniqueIndexAndWordIndexTwoTerms.txt
        4 kB
        Kevin Hore
      33. QueryPlanUniqueIndexAndWordIndexOneTerm.txt
        3 kB
        Kevin Hore

        Issue Links

          Activity

          Kevin Hore added a comment -

          I have also found this issue to be a problem and would like to know whether there are any plans to fix it?

          What follows is a copy of a discussion from the derby-users mailing list from 2005/11/11, "Poor query optimizer choices is making Derby unusable for large tables"; it describes how the same behaviour causes problems for a query with a GROUP BY clause:

          -----------------------------------------------------------------------
          Hi Kevin,

          Kevin Hore wrote:

          >> i) Does anyone have any plans to fix this problem?

          Can you file an enhancement request for this? I think Derby could
          improve its handling of OR/IN clauses. Many databases don't optimize OR
          clauses as much as possible, though some do better than others. It would
          be great if Derby could internally process this as two different scans
          (one for 'CONTACT' and another for 'ADD') and then combine the results.
          Some databases can do this. However, the workaround suggested by Jeff L.
          does this, though you have to rewrite the query.

          Satheesh

          >> ii) In the meantime, are there any work-arounds? I'd appreciate any
          >> suggestions that would decrease the execution time of my second query
          >> below (the one with two search terms). Likewise, any general
          >> strategies for avoiding this problem with IN clauses would be
          >> appreciated.
          >>
          >>
          >> ---PROBLEM DESCRIPTION---
          >>
          >> Consider the table:
          >>
          >> CREATE TABLE tblSearchDictionary
          >> (
          >> ObjectId int NOT NULL,
          >> ObjectType int NOT NULL,
          >> Word VARCHAR(64) NOT NULL,
          >> WordLocation int NOT NULL,
          >> CONSTRAINT CONSd0e222 UNIQUE (ObjectId,ObjectType,Word,WordLocation)
          >> );
          >>
          >> This table has an index on each of the four columns, it also has the
          >> unique index across all four columns as defined above:
          >>
          >> CREATE INDEX tblSearchDictionaryObjectId ON tblSearchDictionary
          >> (ObjectId);
          >> CREATE INDEX tblSearchDictionaryObjectType ON tblSearchDictionary
          >> (ObjectType);
          >> CREATE INDEX tblSearchDictionaryWord ON tblSearchDictionary (Word);
          >> CREATE INDEX tblSearchDictionaryWordLocation ON tblSearchDictionary
          >> (WordLocation);
          >>
          >> The table contains about 260,000 rows.
          >>
          >> The following query selects all rows that match instances of string in
          >> the Word column. It sums the WordLocation column having grouped by the
          >> ObjectId column.
          >>
          >> SELECT ObjectId, SUM(WordLocation) AS Score
          >> FROM tblSearchDictionary
          >> WHERE Word = 'CONTACT'
          >> GROUP BY ObjectId;
          >>
          >> On my machine this will usually complete in an acceptable time of
          >> around 200ms.
          >>
          >> Now consider the following query which adds a second search term on
          >> the same column.
          >>
          >> SELECT ObjectId, SUM(WordLocation) AS Score
          >> FROM tblSearchDictionary
          >> WHERE Word = 'CONTACT' OR Word = 'ADD'
          >> GROUP BY ObjectId;
          >>
          >> This second query usually takes around 10000ms on my machine. My
          >> understanding from the Derby optimizer docs and DERBY-47 is that this
          >> is because Derby is re-writing the query along the following lines,
          >> and then choosing to do a table scan:
          >>
          >> SELECT ObjectId, SUM(WordLocation) AS Score
          >> FROM tblSearchDictionary
          >> WHERE
          >> Word IN ('CONTACT', 'ADD')
          >> AND Word >= 'ADD'
          >> AND Word <= 'CONTACT'
          >> GROUP BY ObjectId;
          >>
          >> The plan for the first query indicates that the tblSearchDictionaryWord
          >> index is used to perform an index scan. However, the plan for the second
          >> query indicates that the majority of the additional time is taken
          >> performing a table scan over the entire table, instead of making use of
          >> the indexes available. Our application uses IN quite frequently, so
          >> this optimizer behaviour would seem to present a significant problem.
          >>
          >> --QUERY PLAN FOR FIRST QUERY---
          >>
          >> Statement Name:
          >> null
          >> Statement Text:
          >> SELECT
          >> ObjectId,
          >> SUM(WordLocation) AS Score
          >> FROM tblSearchDictionary
          >> WHERE
          >> Word = 'CONTACT'
          >> GROUP BY
          >> ObjectId
          >>
          >> Parse Time: 0
          >> Bind Time: 0
          >> Optimize Time: 16
          >> Generate Time: 0
          >> Compile Time: 16
          >> Execute Time: 0
          >> Begin Compilation Timestamp : 2005-11-11 12:28:52.765
          >> End Compilation Timestamp : 2005-11-11 12:28:52.781
          >> Begin Execution Timestamp : 2005-11-11 13:08:15.406
          >> End Execution Timestamp : 2005-11-11 13:08:15.406
          >> Statement Execution Plan Text:
          >> Project-Restrict ResultSet (5):
          >> Number of opens = 1
          >> Rows seen = 93
          >> Rows filtered = 0
          >> restriction = false
          >> projection = true
          >> constructor time (milliseconds) = 0
          >> open time (milliseconds) = 0
          >> next time (milliseconds) = 0
          >> close time (milliseconds) = 0
          >> restriction time (milliseconds) = 0
          >> projection time (milliseconds) = 0
          >> optimizer estimated row count: 1.00
          >> optimizer estimated cost: 226.00
          >>
          >> Source result set:
          >> Grouped Aggregate ResultSet:
          >> Number of opens = 1
          >> Rows input = 113
          >> Has distinct aggregate = false
          >> In sorted order = false
          >> Sort information:
          >> Number of rows input=113
          >> Number of rows output=93
          >> Sort type=internal
          >> constructor time (milliseconds) = 0
          >> open time (milliseconds) = 0
          >> next time (milliseconds) = 0
          >> close time (milliseconds) = 0
          >> optimizer estimated row count: 1.00
          >> optimizer estimated cost: 226.00
          >>
          >> Source result set:
          >> Project-Restrict ResultSet (4):
          >> Number of opens = 1
          >> Rows seen = 113
          >> Rows filtered = 0
          >> restriction = false
          >> projection = true
          >> constructor time (milliseconds) = 0
          >> open time (milliseconds) = 0
          >> next time (milliseconds) = 0
          >> close time (milliseconds) = 0
          >> restriction time (milliseconds) = 0
          >> projection time (milliseconds) = 0
          >> optimizer estimated row count: 118.00
          >> optimizer estimated cost: 226.00
          >>
          >> Source result set:
          >> Index Row to Base Row ResultSet for TBLSEARCHDICTIONARY:
          >> Number of opens = 1
          >> Rows seen = 113
          >> Columns accessed from heap =

          {0, 3}

          >> constructor time (milliseconds) = 0
          >> open time (milliseconds) = 0
          >> next time (milliseconds) = 0
          >> close time (milliseconds) = 0
          >> optimizer estimated row count: 118.00
          >> optimizer estimated cost: 226.00
          >>
          >> Index Scan ResultSet for TBLSEARCHDICTIONARY using index
          >> TBLSEARCHDICTIONARYWORD at read committed isolation level using share
          >> row locking chosen by the optimizer
          >> Number of opens = 1
          >> Rows seen = 113
          >> Rows filtered = 0
          >> Fetch Size = 1
          >> constructor time (milliseconds) = 0
          >> open time (milliseconds) = 0
          >> next time (milliseconds) = 0
          >> close time (milliseconds) = 0
          >> next time in milliseconds/row = 0
          >>
          >> scan information:
          >> Bit set of columns fetched=All
          >> Number of columns fetched=2
          >> Number of deleted rows visited=0
          >> Number of pages visited=4
          >> Number of rows qualified=113
          >> Number of rows visited=114
          >> Scan type=btree
          >> Tree height=3
          >> start position:
          >> >= on first 1 column(s).
          >> Ordered null semantics on the following columns:
          >> 0
          >> stop position:
          >> > on first 1 column(s).
          >> Ordered null semantics on the following columns:
          >> 0
          >> qualifiers:
          >> None
          >> optimizer estimated row count: 118.00
          >> optimizer estimated cost: 226.00
          >>
          >>
          >> --QUERY PLAN FOR SECOND QUERY---
          >>
          >> Statement Name:
          >> null
          >> Statement Text:
          >> SELECT
          >> ObjectId,
          >> SUM(WordLocation) AS Score
          >> FROM tblSearchDictionary
          >> WHERE
          >> Word = 'CONTACT' OR Word = 'ADD'
          >> GROUP BY
          >> ObjectId
          >>
          >> Parse Time: 0
          >> Bind Time: 0
          >> Optimize Time: 0
          >> Generate Time: 15
          >> Compile Time: 15
          >> Execute Time: 4250
          >> Begin Compilation Timestamp : 2005-11-11 13:16:17.578
          >> End Compilation Timestamp : 2005-11-11 13:16:17.593
          >> Begin Execution Timestamp : 2005-11-11 13:16:17.593
          >> End Execution Timestamp : 2005-11-11 13:16:27.437
          >> Statement Execution Plan Text:
          >> Project-Restrict ResultSet (5):
          >> Number of opens = 1
          >> Rows seen = 100
          >> Rows filtered = 0
          >> restriction = false
          >> projection = true
          >> constructor time (milliseconds) = 0
          >> open time (milliseconds) = 4250
          >> next time (milliseconds) = 0
          >> close time (milliseconds) = 0
          >> restriction time (milliseconds) = 0
          >> projection time (milliseconds) = 0
          >> optimizer estimated row count: 1.00
          >> optimizer estimated cost: 82959.49
          >>
          >> Source result set:
          >> Grouped Aggregate ResultSet:
          >> Number of opens = 1
          >> Rows input = 712
          >> Has distinct aggregate = false
          >> In sorted order = false
          >> Sort information:
          >> Number of rows input=712
          >> Number of rows output=593
          >> Sort type=internal
          >> constructor time (milliseconds) = 0
          >> open time (milliseconds) = 4250
          >> next time (milliseconds) = 0
          >> close time (milliseconds) = 0
          >> optimizer estimated row count: 1.00
          >> optimizer estimated cost: 82959.49
          >>
          >> Source result set:
          >> Project-Restrict ResultSet (4):
          >> Number of opens = 1
          >> Rows seen = 712
          >> Rows filtered = 0
          >> restriction = false
          >> projection = true
          >> constructor time (milliseconds) = 0
          >> open time (milliseconds) = 0
          >> next time (milliseconds) = 4219
          >> close time (milliseconds) = 15
          >> restriction time (milliseconds) = 0
          >> projection time (milliseconds) = 0
          >> optimizer estimated row count: 19200.45
          >> optimizer estimated cost: 82959.49
          >>
          >> Source result set:
          >> Project-Restrict ResultSet (3):
          >> Number of opens = 1
          >> Rows seen = 40806
          >> Rows filtered = 40094
          >> restriction = true
          >> projection = false
          >> constructor time (milliseconds) = 0
          >> open time (milliseconds) = 0
          >> next time (milliseconds) = 4219
          >> close time (milliseconds) = 15
          >> restriction time (milliseconds) = 124
          >> projection time (milliseconds) = 0
          >> optimizer estimated row count: 19200.45
          >> optimizer estimated cost: 82959.49
          >>
          >> Source result set:
          >> Table Scan ResultSet for TBLSEARCHDICTIONARY at read
          >> committed
          >> isolation level using share row locking chosen by the optimizer
          >> Number of opens = 1
          >> Rows seen = 40806
          >> Rows filtered = 0
          >> Fetch Size = 1
          >> constructor time (milliseconds) = 0
          >> open time (milliseconds) = 0
          >> next time (milliseconds) = 4001
          >> close time (milliseconds) = 15
          >> next time in milliseconds/row = 0
          >>
          >> scan information:
          >> Bit set of columns fetched=

          {0, 2, 3}

          >> Number of columns fetched=3
          >> Number of pages visited=2978
          >> Number of rows qualified=40806
          >> Number of rows visited=256001
          >> Scan type=heap
          >> start position:
          >> null
          >> stop position:
          >> null
          >> qualifiers:
          >> Column[0][0] Id: 2
          >> Operator: <
          >> Ordered nulls: false
          >> Unknown return value: true
          >> Negate comparison result: true
          >> Column[0][1] Id: 2
          >> Operator: <=
          >> Ordered nulls: false
          >> Unknown return value: false
          >> Negate comparison result: false
          >>
          >> optimizer estimated row count: 19200.45
          >> optimizer estimated cost: 82959.49
          >>
          >> ----------
          >>
          >> Thanks in advance for any help!
          >>
          >> Kind regards,
          >>
          >>
          >> Kevin Hore
          >>
          >>
          >>

          Kevin Hore added a comment -

          I have evidence (presented below) that the query optimizer is making some very poor choices when deciding how to make use of the available indexes. Unfortunately, this is making Derby unusable for the application I'm working on.

          Clearly, the Derby engine is fundamentally capable of executing queries in a sensible amount of time, but the query optimizer appears to get confused if there are several indexes to choose from and multiple references to the same column in the WHERE clause.

          In my previous comment, I gave details of a simple example that demonstrated the problem. To recap, I am executing variants of the following SQL against a relatively simple table:

          SELECT ObjectId, SUM(WordLocation) AS Score
          FROM tblSearchDictionary
          WHERE Word = 'CONTACT' OR Word = 'ADD'
          GROUP BY ObjectId;

          It makes no difference to the query performance or query plan if the WHERE clause is as above, or is re-written as:

          WHERE Word IN ('CONTACT', 'ADD')

          -- ORIGINAL RESULTS: --

          The timings with the schema described in my previous comment are:

          Matching one term: ~200ms avg.
          Matching two terms: ~10000ms avg.

          The "matching one term" timings are with the following WHERE clause:

          WHERE Word = 'CONTACT'

          (searching just for 'ADD' gives similar timings).

          The "matching two terms" timings are with the following WHERE clause:

          WHERE Word = 'CONTACT' OR Word = 'ADD'

          The query plans for these timings can be found in my previous comment.

          With some suggestions from the derby-users list, I have attempted to redefine the indexes on the table to see if that will have any effect on the choices made by the query planner.

          Dropping all the column indexes, and then reordering the unique constraint so that the most varying column is first and the least varying last, caused the query planner to change its plan to use the index backing the unique constraint to match the terms.

          DROP INDEX TBLSEARCHDICTIONARYOBJECTID;
          DROP INDEX TBLSEARCHDICTIONARYOBJECTTYPE;
          DROP INDEX TBLSEARCHDICTIONARYWORD;
          DROP INDEX TBLSEARCHDICTIONARYWORDLOCATION;
          ALTER TABLE TBLSEARCHDICTIONARY DROP UNIQUE CONSd0e222;
          ALTER TABLE TBLSEARCHDICTIONARY ADD CONSTRAINT CONSd0e222 UNIQUE
          (ObjectId,Word,WordLocation,ObjectType);

          However, from the results you can see that the search for a single term suffered a big loss in performance.

          -- RESULTS WITH OPTIMIZED CONSTRAINT: --

          Matching one term: ~4000ms avg.
          Matching two terms: ~600ms avg.

          I have attached the relevant query plans for each of these results.

          This is a very surprising result: matching one term performs far worse than matching two.

          In an attempt to remedy the problem of the poor single-term performance, I re-introduced the index on the Word column with the following schema change:

          CREATE INDEX tblSearchDictionaryWord ON tblSearchDictionary (Word);

          -- RESULTS WITH OPTIMIZED CONSTRAINT AND INDEX ON WORD FIELD: --

          Matching one term: ~200ms avg.
          Matching two terms: ~4500 ms avg.

          Again, I have attached the relevant query plans.

          Although the additional index is used in the single-term query, adding it causes the planner to get confused about which indexes to use in the two-term case. The plan here shows an index scan of both indexes, resulting in an average time seven times worse than before.

          I should add here that replacing the WHERE OR clause for the two term query with an IN() produces exactly the same plan and results, which is what one would expect based on the Derby documentation's description of how the optimizer re-writes WHERE OR clauses.

          It seems to me that the optimizer is broken when processing WHERE clauses with OR and WHERE clauses with IN. The engine is obviously capable of executing my example in a reasonable time, but the planner's choices let it down. This causes significant problems in large applications with many complex queries, particularly where the number of terms in an IN clause varies, as it then becomes almost impossible to construct queries that execute with reliably acceptable performance.

          It may be interesting to note that the application I'm working on can also use Mckoi or SQL Server 2000. Both the single- and two-term searches execute with acceptable performance on both of those database engines.


          Kevin Hore
          www.araxis.com

          Kevin Hore added a comment -

          Query plan for comment made by Kevin Hore 2005/11/15. This is the query plan produced by the optimizer when matching only a single term in a table that contains only the revised unique index.

          Kevin Hore added a comment -

          Query plan for comment made by Kevin Hore 2005/11/15. This is the query plan produced by the optimizer when matching two terms in a table that contains only the revised unique index.

          Kevin Hore added a comment -

          Query plan for comment made by Kevin Hore 2005/11/15. This is the query plan produced by the optimizer when matching only a single term in a table that contains only the revised unique index.

          Kevin Hore added a comment -

          Query plan for comment made by Kevin Hore 2005/11/15. This is the query plan produced by the optimizer when matching two terms in a table that contains only the revised unique index.

          Satheesh Bandaram added a comment -

          Is it possible to contribute a test case with data that I can quickly set up to try on my machine? Make sure your data is sharable (to the whole world!). I see some numbers here that I don't understand, especially the case of two terms being faster than one. I would like to confirm whether this is because of the page cache and/or other suspects I have.

          Daniel James Neades added a comment -

          I'll try to schedule some time for one of our engineers to contribute a test case with data. We're in the middle of a release cycle right now, so it might not be done straight away. Thank you for your interest.

          James Synge added a comment -

          I spent much of last week tracking down a performance problem in an application that I'm working on, and it turned out to be described here.

          I developed a test application (that I will attach) that explores different ways of doing essentially this query:

          SELECT * FROM myTable WHERE foreignKeyColumn IN (?, ..., ?)

          We tried the following strategies:

          Literals - 1 query, using WHERE column IN ('literal1', ..., 'literalN')
          Literal - N queries, using WHERE column = 'literal[i]'
          Markers - 1 query, using WHERE column IN (?, ..., ?)
          Marker - N queries, using WHERE column = ?
          TempTable - 1 query, store parameters in a temp table, use nested select, then delete parameters
          ScratchPad - 1 query, store parameters in a table, use nested select, then delete parameters
          ScrSavepoint - 1 query, set savepoint, store parameters in a table, use nested select, then rollback savepoint
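          The Literals and Markers strategies above both amount to generating an IN predicate with N entries. As an illustration only (a hypothetical helper, not part of the attached test application), this is the shape of SQL those strategies build:

          ```java
          // Hypothetical helper showing the SQL generated by the "Markers"
          // strategy: one query with N parameter markers in the IN list.
          public class InQueryBuilder {
              public static String buildMarkersQuery(String table, String column, int n) {
                  StringBuilder sb = new StringBuilder("SELECT * FROM ").append(table)
                          .append(" WHERE ").append(column).append(" IN (");
                  for (int i = 0; i < n; i++) {
                      if (i > 0) sb.append(", ");
                      sb.append('?');  // one marker per IN-list value
                  }
                  return sb.append(')').toString();
              }

              public static void main(String[] args) {
                  System.out.println(buildMarkersQuery("myTable", "foreignKeyColumn", 3));
              }
          }
          ```

          The "Marker" strategy instead prepares `SELECT * FROM myTable WHERE foreignKeyColumn = ?` once and executes it N times with different bindings.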

          We were astonished to find that converting the query to:

          SELECT * FROM myTable WHERE foreignKeyColumn = ?

          And repeating that query N times was by far the best performer out of 7 different strategies I tried. (This is what I call the Marker strategy above.)

          Here are the results for a table with 100,000 rows in it, and then after that for 1,000,000 rows:
          (Note: this table is tab delimited, for easy importing into Excel. I'll also attach this data.)

          (Columns: ID Count, then a "Total ms / Avg ms" pair for each strategy, in order:
          Literals, Literal, Markers, Marker, TempTable, ScratchPad, ScrSavepoint.)
          1 20 20 10 10 10 10 0 0 1232 1232 1132 1132 1022 1022
          2 881 440 20 10 450 225 0 0 1022 511 1041 520 1042 521
          3 1051 350 30 10 2794 931 0 0 1022 340 1022 340 1012 337
          4 1012 253 40 10 2013 503 0 0 1032 258 1002 250 1202 300
          5 1132 226 40 8 2053 410 0 0 1032 206 1022 204 1042 208
          6 1042 173 50 8 1523 253 0 0 1012 168 1022 170 1051 175
          7 1132 161 60 8 3145 449 0 0 1022 146 1032 147 1112 158
          8 1102 137 60 7 3034 379 10 1 1062 132 1202 150 1082 135
          9 1102 122 60 6 2965 329 0 0 1142 126 1151 127 992 110
          10 1112 111 70 7 3526 352 0 0 1022 102 1052 105 1062 106
          20 1142 57 120 6 3746 187 10 0 1022 51 1112 55 1232 61
          30 1317 43 195 6 4117 137 10 0 1022 34 1082 36 1072 35
          40 1252 31 250 6 4417 110 20 0 1022 25 1091 27 1282 32
          50 1292 25 320 6 4777 95 20 0 1062 21 1052 21 1052 21
          60 1327 22 350 5 5068 84 20 0 1062 17 1082 18 1112 18
          70 1332 19 415 5 5504 78 30 0 1042 14 1142 16 1081 15
          80 1327 16 471 5 5769 72 40 0 1041 13 1052 13 1277 15
          90 1362 15 481 5 6330 70 40 0 1052 11 1152 12 1092 12
          100 1372 13 536 5 6460 64 40 0 1283 12 1092 10 1202 12

          =============================================================================================================================

          (Columns as above: ID Count, then a "Total ms / Avg ms" pair for each strategy, in order:
          Literals, Literal, Markers, Marker, TempTable, ScratchPad, ScrSavepoint.)
          1 160 160 70 70 40 40 41 41 69841 69841 44261 44261 36699 36699
          2 44120 22060 181 90 222124 111062 0 0 8624 4312 6851 3425 6580 3290
          3 10958 3652 120 40 12540 4180 10 3 6461 2153 6431 2143 6520 2173

          James Synge added a comment -

          Updated the performance test with 3 join queries, to compare with the 3 nested queries (the joins perform much better).

          Bryan Pendleton added a comment -

          Note that there is a Wiki page related to this issue: http://wiki.apache.org/db-derby/DerbyBug47

          James Synge added a comment -

          I'm (finally) preparing to work on this issue. I have a general approach, which is to replace the IN LIST with an IN (sub-query-of-single-column-VirtualTable), but also some open questions.

          The basic question is when I should replace the IN predicate. The two basic choices would seem to be during preprocessing or during getNextDecoratedPermutation. The former would likely be simpler to implement, but would force my solution to always be used (which, given the poor cost estimates produced due to this bug, might be a good thing). The latter would require replacing the Optimizable against which the IN predicate is being evaluated with a substitute that represents a nested join (probably). I really don't have enough of a grasp of the sub-query optimization to know whether that is a good idea or not.

          The next question is how I pass information to the VTI implementation regarding which parameters of the activation it should use as the values in the virtual table. I suspect that the constructor will need to take the range [N, N+M) of parameter indices, and also some array of literals; i.e. the IN list can consist of both parameter markers and literal parameters. If the IN list were to include columns or functions, I suspect we'd want to skip this "solution", as it wouldn't support that.

          I assume that the activation object passed to the VTI constructor in VTIResultSet.opencore is where I would get the values for the parameters.
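          As a rough illustration of the value-passing question (hypothetical names; a real Derby VTI would implement java.sql.ResultSet and pull parameter values from the activation), the virtual table is essentially a single-column row source over the combined literals and parameter values:

          ```java
          import java.util.ArrayList;
          import java.util.List;

          // Hypothetical sketch of an IN-list row source: one column, one row per
          // value. The constructor receives the literal values and the already-
          // resolved parameter values; a real VTI would fetch the latter from the
          // Activation at open time.
          public class InListRowSource {
              private final List<Object> values = new ArrayList<>();
              private int cursor = -1;

              public InListRowSource(List<Object> literals, List<Object> parameterValues) {
                  values.addAll(literals);
                  values.addAll(parameterValues);
              }

              public boolean next() { return ++cursor < values.size(); }

              public Object getColumn1() { return values.get(cursor); }
          }
          ```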

          James Synge added a comment -

          I'm currently looking to add the following if/else branch to InListOperatorNode:

          else if ((leftOperand instanceof ColumnReference) &&
                   rightOperandList.containsAllConstantOrParameterNodes())
          {
              // TODO Create nodes representing a sub-select:
              //   SubqueryNode
              //     SelectNode
              //       FromList
              //         FromVTI
              //           (include the correlation name?)
              //           NewInvocationNode
              //           JSQLType (org.derby...InListAsTableVTI)
              //       ResultColumnList
              //         ResultColumn
              //   (don't need a where clause)
              //
              // Issue: how do I assign the table number for the FromVTI?
              // Issue: how do I get binding done for this new SelectNode?
              //        (this is obviously trivial, but needs to be done)
              return this;
          }
          Daniel John Debrunner added a comment -

          Are there possible existing temporary table mechanisms that could be used instead of relying on VTIs?
          I would have thought there are existing cases where a temporary table is built during a query execution and used in subsequent joins.

          The concern I have over VTIs is type conversion: VTIs return rows as JDBC types, but the input to the IN list is in terms of internal DataValueDescriptors. If the IN list values could be kept as internal types, then I think the solution would be easier.

          Of course this could be implemented using VTIs and subsequently improved to use an internal table type (language ResultSet implementation), but I wonder if that approach will lead to unnecessary work.

          Even if an existing table type is not used, I wonder if it's easier to provide an implementation of Derby's language ResultSet wrapped around a collection of values instead of a JDBC ResultSet doing the same work. Of course, with a new language ResultSet there may be more optimization work to do that one gets for free with a VTI.

          language ResultSet = org.apache.derby.iapi.sql.ResultSet

          Daniel John Debrunner added a comment -

          In DERBY-2152 James wrote:
          --------------------
          I'm experimenting with transforming a query such as:

          SELECT * FROM tableA WHERE columnB IN (constC, ?, ?, constD)

          into

          SELECT * FROM tableA WHERE columnB IN (SELECT vti.column1 FROM new ArgsToRowsVTI(SomeArgs))
          ----------------------

          Is this a step towards solving DERBY-47, or does re-writing the query this way solve the performance problem?
          I'm confused because the new query still uses an IN operator, and I don't see any queries like the above as performance experiments (e.g. using another table instead of a VTI).

          I was thinking (along with the last comment), if there was a way to re-write the query using something along the lines of a VALUES clause.

          SELECT * FROM tableA, TABLE(VALUES constC, ?, ?, constD) AS X(C) WHERE tableA.columnB = X.C;

          No idea how that would perform; it also needs tweaking to remove duplicates from the list of values in the VALUES clause.
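          That duplicate-removal tweak could, at rewrite time, be as simple as deduplicating the constant values while preserving their order (a sketch, not Derby code):

          ```java
          import java.util.LinkedHashSet;
          import java.util.List;

          // Deduplicate IN-list / VALUES-clause constants while preserving their
          // first-seen order, so the rewritten VALUES clause yields each value once.
          public class DedupValues {
              public static List<Object> dedup(List<Object> values) {
                  return List.copyOf(new LinkedHashSet<>(values));
              }
          }
          ```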

          Daniel John Debrunner added a comment -

          > The next question is how do I pass information to the VTI implementation regarding which parameters of the activation should it use
          > as the values in the virtual table? I suspect that the constructor will need to take the range [N, N+M) of parameter indices, and also
          > some array of literals; i.e. the IN list can consist of both parameter markers, and literal parameters. If the IN list were to include
          > columns or functions, I suspect we'd want to skip this "solution" as it wouldn't support that.

          Based on past experience it would be good to avoid generating code per literal value, as that's an easy way to hit limits in the compiled byte code. I think you should be able to build up a collection of literal values (DataValueDescriptors) at compile time and store it as a saved object. The runtime ResultSet can then obtain the same collection from the saved objects.
          See CompilerContext.addSavedObject

          I think you have to do the same approach for the parameters, build up a collection of valid parameters, though maybe if you limit your solution you can get away with a range. A range does not work in the general case where a single value in the IN list can be composed of multiple parameters, e.g.
          IN (?,?,?, ?+?)

          With these approaches the amount of information required to create a ResultSet for such an IN list is fixed and not a factor of the number of values, thus leading to smaller generated code to generate the result set.
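          A sketch of the saved-object idea (illustrative names only; the real mechanism is CompilerContext.addSavedObject and the statement's saved-object array): the plan stores the whole value collection once at compile time, and the generated code carries only its index, so code size stays constant regardless of IN-list length.

          ```java
          import java.util.ArrayList;
          import java.util.List;

          // Illustrative sketch, not Derby's real classes: the "compiled plan" keeps
          // IN-list values in a saved-object array built at compile time; the
          // generated code embeds only the array index, so its size does not grow
          // with the number of values.
          public class SavedObjects {
              private final List<Object> saved = new ArrayList<>();

              // "Compile time": store the collection once, return its index.
              public int addSavedObject(Object o) {
                  saved.add(o);
                  return saved.size() - 1;
              }

              // "Run time": the generated code retrieves the collection by index.
              public Object getSavedObject(int index) {
                  return saved.get(index);
              }
          }
          ```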

          James Synge added a comment -

          Dan wrote:
          > Is this a step to solving DERBY-47, or does re-writing the query this way solve the performance problem?

          The goal of the re-write is to avoid the source of DERBY-47, which is that we don't have a multi-probe index scan, only range index scans, and worse still, when there are parameters involved, the range is the entire index.
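The distinction between a range index scan and a multi-probe index scan can be sketched as follows, with a `TreeMap` standing in for the primary-key index (class and method names are illustrative, not Derby code):

```java
import java.util.Collections;
import java.util.List;
import java.util.TreeMap;

/**
 * Contrast the two strategies discussed: a single range scan bounded by the
 * min/max of the IN list versus one index probe per IN-list value.
 */
class ScanStrategies {
    /** Range scan: touches every key between the smallest and largest value. */
    static int rangeScanRowsTouched(TreeMap<Integer, String> index, List<Integer> inList) {
        int lo = Collections.min(inList);
        int hi = Collections.max(inList);
        return index.subMap(lo, true, hi, true).size();
    }

    /** Multi-probe: one O(log n) lookup per value, regardless of their spread. */
    static int multiProbeRowsTouched(TreeMap<Integer, String> index, List<Integer> inList) {
        int touched = 0;
        for (int key : inList) {
            if (index.containsKey(key)) touched++;
        }
        return touched;
    }
}
```

With keys 0..9999 in the index, `IN (-1, 9999)` makes the range scan touch all 10000 entries while the multi-probe strategy touches only the one matching row.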

          My intent is that the re-write will steer Derby toward treating this like an EXISTS; for example:

          SELECT * FROM tableA
          WHERE EXISTS (SELECT 1 FROM new ArgsToRowsVTI(SomeArgs) AS X(C) WHERE tableA.columnB = X.C)

          > I was thinking (along with the last comment), if there was a way to re-write the query
          > using something along the lines of a VALUES clause.
          > SELECT * FROM tableA, TABLE(VALUES constC, ?, ?, constD) AS X(C) where tableA.columnB = X.C;

          I tried a similar approach earlier, and found that it wouldn't compile. Perhaps it was the version of Derby
          I was testing with, or perhaps I just screwed up the syntax, but putting parameters into the FROM
          clause didn't seem to be allowed.

          I'm also concerned about the implications of moving the IN list to the FROM clause. Isn't there a risk of changing the semantics?

          James Synge added a comment -

          Dan wrote:
          > ... store it as a saved object.

          Thanks for the pointer to CompilerContext.addSavedObject, I'd not come across it before.

          Do you think that I could "just" pass InListOperatorNode.rightOperandList (a ValueNodeList) to addSavedObject? If so, then the whole process could be much simpler.

          Daniel John Debrunner added a comment -

          > Do you think that I could "just" pass InListOperatorNode.rightOperandList (a ValueNodeList) to addSavedObject?

          No. QueryTreeNodes are for compile time and are assumed to be private to a connection. Any saved objects and the entire compiled plan is shared across multiple connections. Holding onto a node in the compiled plan would lead to memory leaks as the nodes contain references to objects of its connection, which could not be gc'ed once the connection is closed if a compiled plan still had references to the nodes.

          James Synge added a comment -

          I think I've found the code that caused a problem for me when I tried using VALUES to construct a table using parameters.
          SubqueryNode.bindExpression contains the following:

          /* reject ? parameters in the select list of subqueries */
          resultSet.rejectParameters();

          The syntax I'd used involved replacing an IN list with an IN sub-query, and clearly that would run into the above code.

          So, any idea why that code is there?

          James Synge added a comment -

          I've changed InListOperatorNode.preprocess to replace the node with a new SubqueryNode
          in the case where the list is all parameter markers. I'll attach the changed file.

          To make the SubqueryNode ready for use I need to call bindExpression and preprocess,
          and this is failing in bindExpression because it is attempting to rebind the leftOperand
          (a ColumnReference) when I don't have the full context of the original binding.

          I would seem to have a few choices:

          1) Modify ColumnReference.bindExpression so that it detects the rebinding
          situation and returns without doing anything. I don't know if there are cases where
          rebinding occurs currently and is supposed to change the binding. That
          uncertainty aside, this seems like a good choice.

          2) Modify SubqueryNode.bindExpression to detect when the leftOperand is a
          ColumnReference that is already bound, and skip the leftOperand.bindExpression call.

          3) Copy much of the code from SubqueryNode.bindExpression into
          InListOperatorNode.preprocess, allowing me to bypass the call to
          leftOperand.bindExpression. Definitely ugly, hard to maintain.

          4) Create a trivial ColumnReference subclass that has an empty bindExpression
          implementation, and "clone" the InListOperatorNode.leftOperand as an instance
          of this subclass for use as the SubqueryNode.leftOperand , thus avoiding the
          problem.

          Any thoughts on the preferred approach?

          James Synge added a comment -

          Added code for replacing the InListOperatorNode with a SubqueryNode (IN_SUBQUERY) in the case where all of the list entries are parameter markers. Not yet working (fails in SubqueryNode.bindExpression, calling leftOperand.bindExpression).

          James Synge added a comment -

          I've been thinking more about the suggestion of using a table constructor.
          I was concerned about several issues:

          1) When I tried it previously I got this error:

          ERROR 42Y10: A table constructor that is not in an INSERT
          statement has all ? parameters in one of its columns. For
          each column, at least one of the rows must have a non-parameter.

          Working around this will require figuring out the type of the parameters,
          and creating a dummy value to include in the table, with extra code to
          remove it.

          2) Re-writing the query by adding the table constructor to the fromList
          of the SelectNode that includes the IN list predicate potentially
          changes the semantics of the query (for example, imagine the case
          where the IN list is OR'd with other predicates). This is why I've been
          sticking with a sub-query (which my performance experiments indicated
          can perform well).

          I tried the table constructor again this morning, coming up with a query
          re-write that combines the suggested table constructor with an IN sub-query.
          For example, we would transform:

          SELECT * FROM tableA
          WHERE columnB IN (constC, ?, ?, constD)

          into:

          SELECT * FROM tableA
          WHERE columnB IN (
          SELECT v FROM TABLE(
          VALUES (dummyValue, 0), (constC, 1), (?, 1), (?, 1), (constD, 1)) AS X(v, k) WHERE k=1)

          The sub-query is built up as a tree (really a list) of UNIONs, union-ing together each
          row in the VALUES clause.

          Unfortunately the performance of this is poor, with a table scan of tableA being performed,
          rather than using the index on columnB. I'm investigating why this isn't transformed
          into a nested loop join, with the UNION on the outside.

          James Synge added a comment -

          As stated in my previous comment, I've been debugging the compilation of this query:

          SELECT * FROM tableA
          WHERE columnB IN (
          SELECT v FROM TABLE(
          VALUES (dummyValue, 0), (constC, 1), (?, 1), (?, 1), (constD, 1)) AS X(v, k) WHERE k=1)

          The IN operator is represented with a SubqueryNode of type IN_SUBQUERY.
          SubqueryNode.preprocess "tries" to flatten this, but because the rightOperand
          doesn't select from a single base table, it decides not to flatten the sub-query
          (i.e. it doesn't hoist it up into its parent query).

          As a result, the optimizer isn't presented with a join that it can choose the order of,
          but rather just one top-level optimizable, with a sub-query that must be optimized
          "separately".

          James Synge added a comment -

          Some success at last...

          I forced SubqueryNode.preprocess to flatten the sub-query mentioned
          previously by modifying SubqueryNode.singleFromBaseTable to return
          true (not a valid change, but it helps with this experiment), which then
          enabled the optimizer to re-order the join so that the index on tableA
          is used. Yeah!

          The performance was quite reasonable.

          Possible next steps:

          • Consider wrapping the sub-query's SelectNode in a MaterializeResultSetNode (helpful in those cases where the sub-query is invariant, and the sub-query ends up being executed multiple times... the latter is not really known until later in the optimization process)
          • Modify SubqueryNode.preprocess so that it can flatten more sub-queries (including the one produced above). Materializing the sub-query could help with the flattening.
          • Modify InListOperatorNode.preprocess to produce a sub-query like the one above.
          A B added a comment -

          I've spent some time looking at this issue and have come up with a solution that is based on two earlier comments for this issue. Namely:

          Comment #1:

          > I was thinking if there was a way to re-write the query using something
          > along the lines of a VALUES clause.
          >
          > SELECT * FROM tableA, TABLE(VALUES constC, ?, ?, constD) AS X(C)
          > where tableA.columnB = X.C;
          >
          > no idea how that would perform, it also needs tweaking to remove duplicates
          > from the list of values in the VALUES clause.

          The solution I've come up with does in fact rewrite IN-list queries to create a join between the target table and "something along the lines of a VALUES clause". That said, a VALUES expression in Derby is internally parsed into a chain of nested UNIONs between RowResultSetNodes. I did a quick prototype to duplicate that behavior and found that for IN lists with thousands of values, a chain of UNIONs is not acceptable. More specifically, such a chain inevitably leads to stack overflow because of recursion in preprocessing and/or optimization. And even if the list is small enough to avoid stack overflow, the time it takes to optimize a UNION chain with, say, a thousand values is far too long (I was seeing optimization times of over 20 seconds with 1000 IN-list values). And lastly, even if we were able to code around the stack overflow and optimization time, the limit on the size of a UNION chain that Derby can successfully generate is far less than the current limit on IN lists (see DERBY-1315 for the details), which means we would have to skip the rewrite when the list has more than xxx values. That's doable, but not pretty.

          So in my proposed solution we do not actually create a normal VALUES expression. Instead we create a new type of node called an "InListResultSetNode" that is specifically designed to handle potentially large lists of constant and/or parameter values. Then we create a new equality predicate to join the InListRSN with the appropriate table, and since InListResultSetNode is an instance of Optimizable, the optimizer can then use the predicate to optimize the join.

          Note that I also made changes to the optimizer so that it recognizes InListResultSetNodes as "fixed outer tables", meaning that they are prepended to the join order and remain fixed at their prepended positions throughout the optimization process. This ensures that the optimizer does not have to deal with additional join orders as a result of the IN-list rewrite (it already looks at too many; see DERBY-1907).
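The shape of the proposed rewrite can be sketched as a nested-loop join whose outer table is the IN-list row source: each IN-list value drives one index probe into the target table. `InListJoinSketch` is an illustrative stand-in, not the actual InListResultSet/optimizer code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/**
 * Sketch: the IN-list values exposed as a one-column row source, fixed as the
 * outer table of a nested-loop join against an indexed inner table.
 */
class InListJoinSketch {
    /** The IN-list values as a single-column "result set". */
    static List<Integer> inListRows(int... values) {
        List<Integer> rows = new ArrayList<>();
        for (int v : values) rows.add(v);
        return rows;
    }

    /** Nested-loop join: one index probe per IN-list value. */
    static List<String> join(List<Integer> outer, TreeMap<Integer, String> index) {
        List<String> result = new ArrayList<>();
        for (int key : outer) {          // one iteration per IN-list value
            String row = index.get(key); // index probe on the inner table
            if (row != null) result.add(row);
        }
        return result;
    }
}
```

Because the IN-list side is tiny and fixed at the front of the join order, the cost is proportional to the list length rather than to the size of the target table.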

          Comment #2:

          > Based on past experience it would be good to avoid generating code per
          > literal value, as that's an easy way to hit limits in the compiled byte
          > code. I think you should be able to build up a collection of literal values
          > (DataValueDescriptors) at compile time and store it as a saved object. The
          > runtime ResultSet can then obtain the same collection from the saved objects.
          > See CompilerContext.addSavedObject

          In the solution I've written we do in fact create a "collection of literal values (DataValueDescriptors) at compile time and store it". We do not, however, use saved objects to store/retrieve them. Instead, we use existing code in InListOperatorNode to create an execution time array of DataValueDescriptors and we pass that array to a new execution time result set, InListResultSet, that returns the list of values as a single-column result set. This approach works for both constants and parameters.

          By re-using the existing InListOperator code generation we ensure that the max size of an IN-list after these changes will be the same as what it was before these changes.


          All of that said, the internal addition of a new InListResultSetNode to a FromList is not without its side effects. The following are the three areas in which I've noticed an unintended change in behavior caused by adding an InListRSN to a FromList. I have a "progress-not-perfection" workaround for the first issue; I still need to investigate the latter two (any input from others would of course be much appreciated).

          1) Derby has internal restrictions on the types of queries that it can flatten. Most notably (and as James Synge also noted) a subquery cannot be flattened if its FromList contains more than a single FromTable. When rewriting the IN-list operator as described above, we add the new inListRSN to the appropriate FromList. That means that the FromList will now have more than one FromTable and thus the subquery is no longer flattenable. So for the following query:

          select * from t2, (select * from t1 where i in (1, 2, 3)) x

          the inner select is currently flattened during preprocessing. With the changes as I've described them, though, the IN list for the subquery would become an InListResultSetNode that is added to the inner FromList, thus rendering the inner query non-flattenable. In the interest of making progress (instead of perfection) on this issue, I simply added logic to skip rewriting an IN-list if it appears in a subquery. In that case we will default to normal IN-list processing as it is today.

          2) I just discovered this morning that the addition of an InListResultSet to the FromList causes all of the SUR (scrollable updatable result set) tests that use an IN-clause to fail--apparently the presence of the InListResultSet results in a downgrade of the cursor scrollability to "CONCUR_READ_ONLY". I do not yet know why this is the case, nor do I know how to prevent it.

          3) The store/readlocks.sql test fails with the proposed changes because of missing ROW locks. I do not know if these missing locks are a problem or just a side effect that can be "fixed" with a master update. More investigation required...

          There were of course other tests that failed with row orderings and/or different plans, but so far as I can tell all of those are expected and correct--so I will just update the master files as needed.

          For now I have just attached an initial version of the engine changes, d47_engine_doNotCommit_v1.patch, for general review/comments if anyone has any. I plan to look into the above issues more and will probably go to derby-dev with questions where needed. In the meantime, any feedback on the general approach as outlined above would be appreciated.

          Oh, and by the way: I ran Derby47PerformanceTest.java (as attached to this issue) with 100,000 rows after applying this patch. Whereas the "Markers" strategy was by far the worst query before my changes, it ends up being the best strategy after my changes, beating out the "Marker" and "JoinTemp" strategies (which were previously the best) by roughly 30 and 25 percent, respectively.

          A B added a comment -

          Accidentally attached the _v1 files without granting license to ASF. So I'm reattaching with the correct "Attachment license" option.

          A B added a comment -

          Attaching the diff seen with store/readlocks.sql when d47_engine_doNotCommit_v1.patch is applied. This is the actual diff produced from the test; I'll try to post a modified diff that includes the relevant queries tomorrow, to (hopefully) aid in the determination of whether or not this diff is acceptable...

          A B added a comment -

          Attaching another readlocks diff, this time with more context so that the queries in question can be seen...

          A B added a comment -

          In one of my previous comments I mentioned that d47_engine_doNotCommit_v1.patch has a problem where the addition of an InListResultSet to the FromList causes all of the SUR (Scrollable Updatable Result set) tests that use an IN-clause to fail. More specifically, the presence of the InListResultSet causes Derby to downgrade the cursor scrollability to "CONCUR_READ_ONLY".

          I was eventually able to track down the cause of this behavior: whether or not Derby downgrades a result set to CONCUR_READ_ONLY depends on the value returned by the execution time result set in the "isForUpdate()" method. The default (as defined in NoPutResultSetImpl) is to return false.

          In the case of the SUR tests (prior to my changes) we were getting a ScrollInsensitiveResultSet on top of a TableScanResultSet. The former gets its "isForUpdate()" value from the latter, and the latter correctly returns "true" to indicate that the result set is for update and thus no downgrade is needed. With d47_engine_doNotCommit_v1.patch applied, though, we add an InListResultSet to the FromList, which ultimately gives us a JoinResultSet at execution time. The JoinResultSet class does not define an "isForUpdate()" method, so we just return the default--which is "false". That causes the cursor concurrency to be downgraded to CONCUR_READ_ONLY.

          To see what would happen I forced JoinResultSet to return "true" and then there was an ASSERT failure because JoinResultSets are not expected to be used for update/delete. The relevant code is in execute/JoinResultSet.java; in the "getRowLocation()" method we see the following comment:

           * A join is combining rows from two sources, so it has no
           * single row location to return; just return a null.

          My guess is that, since the decision to disallow update/delete on a JoinResultSet was intentional, trying to code around that restriction is a bad idea. Or at the very least, it would require a lot more investigation and/or work.

          Instead of pursuing that potentially treacherous path, I was able to come up with some logic that checks to see if a result set is updatable at compile time and, if so, skips the IN-list rewrite. Early testing suggests that this is a viable solution.

          HOWEVER, as the list of "exceptions" to the IN-list (_v1.patch) rewrite grew, I started to wonder if there wasn't some other solution to DERBY-47 that would accomplish the same thing, minus all of the exception cases.

          A few days later I came across some queries involving multiple IN-list predicates for a single SELECT statement. It turns out that many of those queries return incorrect results (duplicate rows) and/or run much more slowly with the _v1 patch than without.

          The fact that there are so many "exceptions to the rule" combined with the undesirable behavior in the face of multiple IN-lists prompted me to abandon my initial idea of internally adding an InListResultSetNode to the user's FROM list (d47_engine_doNOTCommit_v1.patch).

          Instead I have been working on an alternate approach to the problem. This second approach, like the first, is based on a comment made by someone else on this issue. This time, though, the comment in question is from James Synge and is as follows:

          > The goal of the re-write is to avoid the source of DERBY-47,
          > which is that we don't have a multi-probe index scan, only
          > range index scans, and worse still, when there are parameters
          > involved, the range is the entire index.

          In short, I decided to work with the execution-time result sets to see if it is possible to enforce some kind of "multi-probing" for indexes. That is to say, instead of scanning a range of values on the index, we want to make it so that Derby probes the index N times, where N is the size of the IN-list. For each probe, then, we'll get all rows for which the target column's value equals that probe's IN-list value.

          Effectively, this comes down to the "Marker" strategy in Derby47Performance.java, where we evaluate an equality predicate, "column = ?", N times. Except of course that, unlike with the Marker strategy, we do the N evaluations internally instead of making the user do it.

          A high-level description of how this approach will work is as follows:

          • If any value in the IN-list is not a constant AND is not a parameter,
            then do processing as usual (no optimizations, no rewrite). Notice
            that with this approach we do still perform the optimization if
            IN-list values are parameters. That is not the case for the current
            Derby rewrite (i.e. without any changes for DERBY-47).

          Otherwise:

          • During preprocessing, replace the IN-list with an equality predicate of
            the form "column = ?". I.e. the right operand will be a parameter node,
            the left operand will be whatever column reference belongs to the IN-list
            (hereafter referred to as the "target column"). We call this new, internal
            predicate an IN-list "probe predicate" because it will be the basis of
            the multi-probing logic at execution time.
          • During costing, the equality predicate "column = ?" will be treated as
            a start/stop key by normal optimizer processing (no changes necessary).
            If the predicate is deemed a start/stop key for the first column in
            an index, then we'll multiply the estimated cost of "column = ?" by
            the size of the IN-list, to account for the fact that we're actually
            evaluating the predicate N times (not just once).

          If the predicate is not a start/stop key for the first column in
          the index, then we do not adjust the cost. See below for why.

          • After we have found the best plan (including a best conglomerate choice),
            check to see if the IN-list probe predicate is a start/stop key on the
            first column in the chosen conglomerate. In order for that to be the
            case the conglomerate must be an index (because we don't have start/stop
            keys on table conglomerates). If the probe predicate is not a
            start/stop key for the first column in the index, then we "revert" the probe
            predicate back to its original IN-list form and evaluate it as a "normal"
            InListOperatorNode. In this way we are effectively "giving up" on the multi-
            probing approach. This is why we don't change the cost for such probe
            predicates (mentioned above).

          If the probe predicate is a start/stop key on the first column in
          the index conglomerate, then leave it as it is.

          • When it comes time to generate the byte code, look to see if we have
            a probe predicate that is a start/stop key on the first column in the
            chosen conglomerate. If so, generate an array of DataValueDescriptors
            to hold the values from the corresponding IN-list and pass that array
            to the underlying execution-time result set (i.e. to TableScanResultSet).
            Then generate the probe predicate as a "normal" start/stop key for the
            scan. This will serve as a "place-holder" for the IN-list values
            at execution time.
          • Finally, at execution time, instead of using a single start key and a
            single stop key to define a scan range, we iterate through the IN-list
            values and for each non-duplicate value V[i] (0 <= i < N) we treat V[i]
            as both a start key and a stop key. Or put another way, we plug
            V[i] into the "column = ?" predicate and retrieve all matching rows.
            In this way we are "probing" the index for all rows having value V[i]
            in the target column. Once all rows matching V[i] have been returned,
            we then grab the next IN-list value, V[i+1], and we reposition the
            scan (by calling "reopenCore()"), this time using V[i+1] as the start
            and stop key (i.e. plugging V[i+1] into "column = ?"). This will
    return all rows having value V[i+1] in the target column.

          Continue this process until all values in the IN-list have been
          exhausted. When we reach that point, we are done.
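The costing step described above could be sketched roughly as follows. This is an illustrative simplification, not Derby's actual CostEstimate interface; the method and parameter names here are hypothetical:

```java
// Hypothetical sketch (NOT Derby's actual costing API) of the
// adjustment described above: scale the single-probe cost by the
// IN-list size only when the probe predicate is a start/stop key on
// the first column of the index under consideration.
public class ProbeCosting {
    public static double adjustedCost(double singleProbeCost,
                                      int inListSize,
                                      boolean startStopKeyOnFirstColumn) {
        if (startStopKeyOnFirstColumn) {
            // the "column = ?" predicate is evaluated once per IN-list value
            return singleProbeCost * inListSize;
        }
        // multi-probing will be abandoned for this predicate, so the
        // estimate is left unchanged
        return singleProbeCost;
    }
}
```

For example, a probe cost of 2.0 with a 10-value IN-list on a usable index would be costed as 20.0, while a predicate that cannot serve as a start/stop key on the first index column keeps its original estimate.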

          As a simple example, assume our query is of the form:

          select ... from admin.changes where id in (1, 20000)

          During preprocessing we will effectively change this to be:

          select ... from admin.changes where id = ?

          where "id = ?" is our IN-list "probe predicate". Note that we must make sure the optimizer recognizes "id = ?" as a disguised IN-list operator, as opposed to a true relational operator. The reason is that, while we do treat the probe predicate as a "fake" relational operator so that it can be seen as a start/stop key for the relevant indexes, there are many operations (ex. transitive closure) that should not be done on a probe predicate. So we have to be able to distinguish a probe predicate from other relational predicates.

          With the probe predicate in place the optimizer will determine that it is a start/stop key for any index having column "ID" as the first column, and thus the optimizer is more likely to choose one of those indexes over a table scan. If we assume the optimizer decides to use an index on ID for the query, then we'll generate the IN-list values (1, 20000) and we will pass them to the underlying index scan. We'll then generate the probe predicate "column = ?" as a start and stop key for the scan.

          At execution, then, the scan will first open up the index by using "1" as the start and stop key for the scan (or put another way, by plugging "1" into the probe predicate, "column = ?"). That will return all rows having ID equal to "1". Then when that scan ends, we'll reopen the scan a second time, this time using "20000" as the start and stop key, therefore returning all the 20000's. Meanwhile, all result sets higher up in the result set tree will just see a stream of rows from the index where ID only equals 1 or 20000.
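The execution-time probing just described might be sketched as follows, using a TreeMap as a stand-in for the B-tree index and skipping the reopenCore() machinery; all class and method names here are illustrative, not Derby internals:

```java
import java.util.*;

// Hypothetical sketch of the execution-time loop. A NavigableMap stands
// in for the index conglomerate (key value -> row ids); the real code
// repositions a TableScanResultSet via reopenCore() instead.
public class MultiProbeSketch {
    public static List<Integer> multiProbe(
            NavigableMap<Integer, List<Integer>> index, int[] inListValues) {
        int[] vals = inListValues.clone();
        Arrays.sort(vals);                        // duplicates become adjacent
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < vals.length; i++) {
            if (i > 0 && vals[i] == vals[i - 1]) {
                continue;                         // skip duplicate probe values
            }
            // "reopen" the scan with vals[i] as both start and stop key,
            // i.e. an exact-match probe of the index
            List<Integer> rows = index.get(vals[i]);
            if (rows != null) {
                result.addAll(rows);
            }
        }
        return result;
    }
}
```

Result sets higher up in the tree just see the concatenated stream of matching rows, one probe at a time.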

          This works for IN-lists with parameters, as well, because by the time execution starts we know what values the parameters hold.

          The first draft of code for implementing all of this is pretty much done, but I plan to post it in increments for ease of review. If all goes well, nothing should change functionally until the final patch, which will be the preprocessing patch that does the actual creation of the "probe predicate". At that point all of the machinery should be in place to recognize the probe predicate and function accordingly.

          I ran Derby47PerformanceTest with this "multi-probing" approach and the numbers were similar to what I saw with the _v1 approach (more details coming later). Unlike the _v1 approach, though, the multi-probing approach works with subqueries, left outer joins, and scrollable updatable result sets. In addition, all of the test queries that I came up with involving multiple IN-lists run correctly with the multi-probing approach, and in many cases run quite a bit more quickly--neither of which was true for the _v1 approach. And yes, I do hope to add those test cases to the nightlies as part of my work on this issue.

          Incremental patches are forthcoming. In the meantime, if anyone has any comments/suggestions on the approach outlined above, I would appreciate the feedback...

          A B added a comment -

          Posting d47_mp_relOpPredCheck_v1.patch, which is the first patch for the multi-probing ("mp") approach described in my previous comment. As mentioned in that comment, we need to be able to distinguish between "true" relational predicates and "probe predicates" so that we do not incorrectly perform certain operations on probe predicates. This first patch adds the logic to allow such distinction. In particular it:

          • Adds a new method, "isRelationalOpPredicate()", to Predicate.java that
            only returns true if the predicate is a "true" relational predicate; i.e.
            it will return "false" for probe predicates.
          • Updates several "if" statements in Predicate.java and PredicateList.java
            to use the new method.
          • Updates several utility methods in BinaryRelationalOperatorNode to distinguish
            "true" relational operators from ones that are created internally for probe
            predicates.
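In essence, the new check might look something like this stripped-down sketch; the real method lives in Predicate.java and inspects the node tree, whereas the boolean inputs here are illustrative stand-ins:

```java
// Hypothetical simplification of the new check. A predicate counts as a
// "true" relational predicate only if it is a relational operator that
// was NOT generated internally as an IN-list probe predicate; operations
// such as transitive closure consult this check before acting.
public class PredicateSketch {
    public static boolean isRelationalOpPredicate(
            boolean isRelationalOperator, boolean isInListProbePredicate) {
        return isRelationalOperator && !isInListProbePredicate;
    }
}
```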

          There should be no functional changes to Derby as a result of this patch, but just to make sure I ran derbyall and suites.All on Red Hat Linux with ibm142. The only failure was a known issue (DERBY-2348).

          This is a pretty small patch so unless I hear otherwise I plan to commit it tomorrow (Friday Feb 23, PST).

          James Synge added a comment -

          Army, this sounds like great progress, thanks.

          Will this impact UPDATES? I want to ensure that each row selected by the following WHERE clause gets updated only once:

          UPDATE t SET someColumn = 1 - someColumn WHERE someColumn IN (0, 1)

          This will change all 1's to 0's and 0's to 1's in t.someColumn, but only if this is logically performed in a single pass over the table. This is of course something which all DBMS's have to take care with, and I think Derby was already correct in this area (I vaguely recall reading that somewhere).

          Performance may benefit from sorting the parameter values during execution, prior to the probing (e.g. if the index is large, and the number of parameters is also large). This sorting is done during compilation for literals.

          A B added a comment -

          Thank you very much for the feedback, James.

          > Will this impact UPDATES? I want to ensure that each row selected by the following WHERE
          > clause gets updated only once:
          >
          > UPDATE t SET someColumn = 1 - someColumn WHERE someColumn IN (0, 1)
          >
          > This will change all 1's to 0's and 0's to 1's in t.someColumn, but only if this is logically
          > performed in a single pass over the table.

          That's an excellent question; thank you for bringing it up. The best thing I could do to answer it was to try it out with the code in my local codeline. Does this example adequately reflect the scenario you are talking about? And are these the results you would expect? I think the answer to both of those questions is "Yes" but I just want to be safe.

          Here's what I did in my local codeline. Note that lines beginning with "=" are debug lines that I have in my codeline to indicate that we are in fact doing "multi-probing" with the given number of IN-list values.

          create table t (inew int, iold int);
          insert into t (iold) values 2, 1, -1, 0;
          insert into t (iold) values 2, 1, -1, 0;
          insert into t (iold) values 2, 1, -1, 0;
          insert into t (iold) values 2, 1, -1, 0;
          update t set inew = iold;

          create index t_ix1 on t(inew);

          ij> select * from t order by iold;
          INEW |IOLD
          -----------------------
          -1 |-1
          -1 |-1
          -1 |-1
          -1 |-1
          0 |0
          0 |0
          0 |0
          0 |0
          1 |1
          1 |1
          1 |1
          1 |1
          2 |2
          2 |2
          2 |2
          2 |2

          16 rows selected

          ij> update t set inew = 1 - inew where inew in (1, 0);
          = org.apache.derby.impl.sql.execute.TableScanResultSet.openCore w/ 2 vals.
          8 rows inserted/updated/deleted
          ij> select * from t order by iold;
          INEW |IOLD
          -----------------------
          -1 |-1
          -1 |-1
          -1 |-1
          -1 |-1
          1 |0
          1 |0
          1 |0
          1 |0
          0 |1
          0 |1
          0 |1
          0 |1
          2 |2
          2 |2
          2 |2
          2 |2

          16 rows selected

          ij> update t set inew = 1 - inew where inew in (1, 0);
          = org.apache.derby.impl.sql.execute.TableScanResultSet.openCore w/ 2 vals.
          8 rows inserted/updated/deleted
          ij> select * from t order by iold;
          INEW |IOLD
          -----------------------
          -1 |-1
          -1 |-1
          -1 |-1
          -1 |-1
          0 |0
          0 |0
          0 |0
          0 |0
          1 |1
          1 |1
          1 |1
          1 |1
          2 |2
          2 |2
          2 |2
          2 |2

          16 rows selected

          I believe that is the correct behavior, esp. since that's what I see in a clean codeline, as well.

          > Performance may benefit from sorting the parameter values during execution, prior to the probing .

          Yes, my changes include an execution time sort of the IN-list values (on the condition that the sort was not done during compilation). I have to admit, though, that I didn't make that decision for performance reasons; rather, I chose to sort the IN-list values to make it easier to detect (and skip over) duplicates in the IN-list...
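That execution-time sort-and-skip step could be sketched like this (an illustrative sketch, not the actual Derby code; the class and method names are hypothetical):

```java
import java.util.*;

// Hypothetical sketch of the execution-time preparation step: sort the
// IN-list values (when compilation could not, e.g. for parameters) so
// that duplicates become adjacent and can be skipped with a single
// comparison against the previous value.
public class SortAndDedupe {
    public static int[] distinctSorted(int[] values) {
        int[] v = values.clone();
        Arrays.sort(v);
        int n = 0;
        for (int i = 0; i < v.length; i++) {
            if (i == 0 || v[i] != v[i - 1]) {
                v[n++] = v[i];                    // keep first of each run
            }
        }
        return Arrays.copyOf(v, n);
    }
}
```

A side benefit of sorting, as noted above, is that the probes then walk the index in key order, which can also help performance for large parameter lists.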

          I greatly appreciate your feedback on this and hope you will continue to ask any questions you might have. The more eyes, the better...

          Show
          A B added a comment - Thank you very much for feedback, James. > Will this impact UPDATES? I want to ensure that each row selected by the following WHERE > clause gets updated only once: > > UPDATE t SET someColumn = 1 - someColumn WHERE someColumn IN (0, 1) > > This will change all 1's to 0's and 0's to 1's in t.someColumn, but only if this is logically > performed in a single pass over the table. That's an excellent question; thank you for bringing it up. The best thing I could do to answer it was to try it out with the code in my local codeline. Does this example adequately reflect the scenario you are talking about? And are these the results you would expect? I think the answer to both of those questions is "Yes" but I just want to be safe. Here' s what I did in my local codeline. Note that lines beginning with " = " are debug lines that I have in my codeline to indicate that we are in fact doing "multi-probing" with x number of IN-list values. create table t (inew int, iold int); insert into t (iold) values 2, 1, -1, 0; insert into t (iold) values 2, 1, -1, 0; insert into t (iold) values 2, 1, -1, 0; insert into t (iold) values 2, 1, -1, 0; update t set inew = iold; create index t_ix1 on t(inew); ij> select * from t order by iold; INEW |IOLD ----------------------- -1 |-1 -1 |-1 -1 |-1 -1 |-1 0 |0 0 |0 0 |0 0 |0 1 |1 1 |1 1 |1 1 |1 2 |2 2 |2 2 |2 2 |2 16 rows selected ij> update t set inew = 1 - inew where inew in (1, 0); = org.apache.derby.impl.sql.execute.TableScanResultSet.openCore w/ 2 vals. 8 rows inserted/updated/deleted ij> select * from t order by iold; INEW |IOLD ----------------------- -1 |-1 -1 |-1 -1 |-1 -1 |-1 1 |0 1 |0 1 |0 1 |0 0 |1 0 |1 0 |1 0 |1 2 |2 2 |2 2 |2 2 |2 16 rows selected ij> update t set inew = 1 - inew where inew in (1, 0); = org.apache.derby.impl.sql.execute.TableScanResultSet.openCore w/ 2 vals. 
8 rows inserted/updated/deleted ij> select * from t order by iold; INEW |IOLD ----------------------- -1 |-1 -1 |-1 -1 |-1 -1 |-1 0 |0 0 |0 0 |0 0 |0 1 |1 1 |1 1 |1 1 |1 2 |2 2 |2 2 |2 2 |2 16 rows selected I believe that is the correct behavior, esp. since that's what I see in a clean codeline, as well. > Performance may benefit from sorting the parameter values during execution, prior to the probing . Yes, my changes include an execution time sort of the IN-list values (on the condition that the sort was not done during compilation). I have to admit, though, that I didn't make that decision for performance reasons; rather, I chose to sort the IN-list values to make it easier to detect (and skip over) duplicates in the IN-list... I greatly appreciate your feedback on this and hope you will continue ask any questions you might have. The more eyes, the better...
          Mike Matrigali added a comment -

          The multiple probe approach always seemed the most natural to me, as the multiple probe support was already there
          and used by the execution engine (for different reasons). It is great that it looks like you have found an elegant way
          to get the optimizer to cost the approach and throw out cases that the approach does not support. Part of that was that I understood the mechanics of the multiple probe but didn't understand the semantics of the query rewrite.

          Can you say something
          why you chose to use x = ? predicate with special flag vs. just having a new multiple-probe inlist predicate (this question
          may not make sense - I am talking from the outside of your description not from knowledge of internals).

          Also what happens to a query that is effectively an IN list that is hand written using OR's instead (ie, where i = 1 or i = 2 or ...).
          Is that already changed to an IN list before we get to your new code here?

          This is probably food for a different discussion, but I was wondering about the costing. What is the costing % number of
          rows for a where IN (ie. a parameter at compile time vs a constant, in a non-unique index)? Is this just the cardinality
          statistic if it exists? What is the default without the statistic? Where I am going is that it probably does not make sense
          to have the estimate of the sum of terms be larger than the number of rows in the db. And just want to understand how many
          terms will it take before we give up on the multiple probe.

          James Synge added a comment -

          Army, yes that's the behavior I was hoping to see. Thanks for double checking.

          A B added a comment -

          Thank you very much for your excellent questions, Mike. My attempted answers are below...

          > Can you say something why you chose to use x = ? predicate with special flag vs. just having a new
          > multiple-probe inlist predicate

          Good question. I guess the short answer is simply: code reuse. All of the optimization, modification, generation, and execution-time logic for a single-sided predicate is already written and has (presumably) been working for years. Among other things this includes the notion of "start/stop" keys to (re-)position an index scan, which is ultimately what we want and is something that store already knows about. By using a flag we can slightly alter the behavior at key points of certain methods and then, for everything else, we just let Derby do what it already knows how to do. Minimal code changes are required and if something breaks, odds are that it is in the "slightly altered" behavior (or lack thereof), of which there is far less than "everything else".

          If anyone knows of how we could improve/simplify the logic and/or performance by creating a new multi-probe predicate then I am certainly open to investigating that path further. But for now it seemed like the creation of "x = ?" with a flag was the simplest and quickest way to go, and it seems to provide the desired results. So that's where I ended up...

          > Also what happens to a query that is effectively an IN list that is hand written using OR's instead
          > (ie, where i = 1 or i = 2 or ...). Is that already changed to an IN list before we get to your new
          > code here?

          Yes, transformation of ORs into IN-lists occurs during preprocessing of the OR list. In OrNode.preprocess() there is logic to recognize if an OR list is transformable into an IN-list and, if so, the IN-list is created and then the "preprocess()" method of the IN-list is called. Since the creation of "probe predicates" occurs as part of IN-list preprocessing, this means that, yes, ORs are already converted to an IN-list before my new code takes effect.

          As a side note, if there is an OR clause which itself has an IN-list as one of its operands then OrNode preprocessing will, with my proposed changes, combine the existing IN-list with the newly-created IN-list. For example:

          select i, c from t1 where i in (1, 3, 5, 6) or i = 2 or i = 4

          will be changed to:

          select i, c from t1 where i in (1, 3, 5, 6, 2, 4)

          This conversion will happen as part of OrNode.preprocess(), as well.
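          [Editor's note] The transformation described above can be sketched abstractly. This toy Java example (all names hypothetical; real Derby operates on query-tree nodes, not integer lists) shows how equality terms and an existing IN-list over the same column, joined by OR, collapse into a single value list:

```java
import java.util.ArrayList;
import java.util.List;

public class OrToInList {
    // Hypothetical sketch of the OrNode.preprocess() idea: each OR
    // operand over the same column is represented as its list of values.
    // A simple "i = c" term contributes a one-element list; an existing
    // IN-list contributes all of its values.
    static List<Integer> combine(List<List<Integer>> orOperands) {
        List<Integer> inList = new ArrayList<>();
        for (List<Integer> operand : orOperands) {
            inList.addAll(operand);
        }
        return inList;
    }

    public static void main(String[] args) {
        // i in (1, 3, 5, 6) or i = 2 or i = 4  ->  i in (1, 3, 5, 6, 2, 4)
        List<List<Integer>> operands = List.of(
                List.of(1, 3, 5, 6), List.of(2), List.of(4));
        System.out.println(combine(operands));
        // prints: [1, 3, 5, 6, 2, 4]
    }
}
```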

          > What is the costing % number of rows for a where IN (ie. a parameter at compile time vs a
          > constant, in a non-unique index)? Is this just the cardinality statistic if it exists?

          Generally speaking the way costing for a base table conglomerate works is that we figure out how many rows there are in the table before any predicates are applied. Then, if we have a start/stop predicate and we have statistics, we will calculate a percentage of the rows expected (called "start/stop selectivity") based on the statistics. This ultimately brings us to the "selectivity(Object[])" method of StatisticsImpl, where there is the following code:

          if (numRows == 0.0)
              return 0.1;

          return (double)(1/(double)numUnique);

          I.e. the selectivity is 1 over the number of unique values in the conglomerate. Is this what you mean by "just the cardinality statistic if it exists?"

          In any event we then multiply that percentage by the estimated row count to get a final estimated row count (I'm leaving out lots of "magic" costing operations here to keep things simple (and because I don't really understand all of that magic myself...)).

          > What is the default without the statistic?

          If we do not have statistics for a specific conglomerate then we will simply default the start/stop selectivity to 1.0, i.e. the row count will not be adjusted (at least not as relates to this discussion).

          > Where I am going is that it probably does not make sense to have the estimate of the sum of terms
          > be larger than the number of rows in the db.

          Yes, you're absolutely right. This actually occurred to me yesterday, which is why I was poking around the stats code and thus was able to answer your previous question. I agree that the estimated row count should not exceed the total number of rows. I think we could just account for this by adding an explicit check to see if rowCount * sizeOfInList yields a number larger than the number of rows in the conglomerate. If so then we set it to the number of rows in the conglomerate and that's that.
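          [Editor's note] The estimate being discussed (selectivity of 1/numUnique per probe, multiplied by the IN-list size, capped at the table's row count) can be sketched as below. This is a simplified illustration under the assumptions stated in the thread, not Derby's actual costing code, which involves much more.

```java
public class InListCostSketch {
    // Mirrors the StatisticsImpl.selectivity(Object[]) snippet quoted
    // above: default 0.1 when there are no rows, else 1 over the number
    // of unique values.
    static double selectivity(double numRows, double numUnique) {
        if (numRows == 0.0) {
            return 0.1;
        }
        return 1.0 / numUnique;
    }

    // Cost "column = ?" once, multiply by the number of probe values,
    // and cap the estimate at the total number of rows in the table.
    static double estimatedRows(double tableRows, double numUnique, int inListSize) {
        double perProbe = tableRows * selectivity(tableRows, numUnique);
        return Math.min(tableRows, perProbe * inListSize);
    }

    public static void main(String[] args) {
        // 10,000 rows, 10,000 distinct values, 3-value IN-list: roughly 3 rows.
        System.out.println(estimatedRows(10000, 10000, 3));
        // 10,000 rows, only 2 distinct values, 5-value IN-list:
        // uncapped product would be 25,000, so the cap kicks in.
        System.out.println(estimatedRows(10000, 2, 5));
        // prints: 10000.0
    }
}
```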

          > And just want to understand how many terms will it take before we give up on the multiple probe.

          Another great question. The answer is that we do not ever give up on multi-probing as part of "costing" per se. Rather, we calculate a cost and then we compare that cost with all of the other costs found so far; if it's cheaper we use it, otherwise we discard it. Note that "cheaper" here encapsulates a lot of other logic and optimizer info that is far beyond the scope of this discussion.

          So in the context of row counts, if the number of IN-list predicates multiplied by the estimated row count (after stat selectivity is applied) yields a high percentage row count (e.g. all rows in the table) then the odds of the optimizer choosing to use that particular index are lower. It may still choose to use the index, in which case multi-probing will take effect, but it probably will not (it all depends). Thus the point at which we give up on multi-probing is a function of how unique the column values are and how many values are in the IN-list. If you're just looking at the size of the IN-list, then smaller lists are more likely to result in IN-list probing than larger ones--which I think is what we would expect.

          That's a bit of a vague answer but so much of it depends on the query and the data in question that I wouldn't want to say anything more specific than that...

          A B added a comment -

          Committed d47_relOpPredCheck_v1.patch with svn 512079 after getting the okay from Mike to do so (on derby-dev):

          http://svn.apache.org/viewvc?view=rev&rev=512079

          Attaching the second incremental patch, d47_mp_CBO_MoAP_v1.patch, which updates the logic for cost-based optimization (CBO) and modification of access paths (MoAP) to recognize IN-list "probe predicates" and to handle them appropriately. More specifically this patch adds code to do the following:

          • During costing, recognize when we're using a probe predicate as a start/stop key
            and adjust the cost accordingly. This means multiplying the estimated cost and
            row count for "column = ?" by the number of values in the IN-list (because we are
            effectively going to evaluate "column = ?" N times, where N is the size of the
            IN-list, and we could return one or more rows for each of the N evaluations).
            As mentioned in Mike's comment above, we also want to make sure that the resultant
            row count estimate is not greater than the total number of rows in the table.
          • When determining which predicates can be used as start/stop keys for the current
            conglomerate, only consider a probe predicate to be a start/stop key if it applies
            to the first column in the conglomerate. Otherwise the probe predicate would
            end up being generated as a store qualifier, which means we would only get rows
            for which "column = ?" was true when the parameter was set to the first value
            in the IN-list. That means we would end up with incorrect results (missing
            rows).
          • If cost-based optimization is complete and we are modifying access paths in
            preparation for code generation, then take any probe predicates that are not
            going to be used as start/stop keys for the chosen conglomerate and "revert" them
            back to their original IN-list form (i.e. to the InListOperatorNodes from which
            they were built). Those InListOpNodes will then be generated as normal IN-list
            restrictions on the rows returned from store. If we did not do this reverting
            then the predicates would ultimately be ignored (since they are not valid
            qualifiers) and we would therefore end up with incorrect results (extra rows).
          • If we're modifying access paths and we have chosen to do multi-probing of an index
            then we disable bulk fetching for the target base table. Logically this is not a
            requirement. However, it turns out that bulk fetch can lead to poor performance
            when multi-probing an index if the number of probe values is high (several hundred
            or more) BUT that number is still just a small fraction of the total number of rows
            in the table. An example of such a scenario is found in the Derby47PerformanceTest
            program attached to this issue. If the total number of rows in the ADMIN.CHANGES
            table is 100,000 and there are 200 or more parameter markers in the IN-list, the
            performance of multi-probing with bulk fetch enabled is just as bad as a table
            scan--and actually gets worse as the number of parameters grows.

          I cannot say with any certainty why bulk fetching performs so badly in this situation. My guess (and it's just a guess) is that when we bulk fetch we end up reading unnecessary pages from disk. My (perhaps faulty) thinking is that for each probe we do of the index our start and stop keys are going to be the same value. That means that we are probably going to be returning at most a handful of rows (more likely just a row or two). But perhaps bulk fetching is somehow causing us to read more pages from disk than we need and the result is a slowdown in performance?

          Does anyone know if that actually makes any sense? I could be completely wrong here so I'd appreciate any correction.

          All of that said, I found that if I disable bulk fetch for multi-probing the performance returns to what I would expect (matching and even beating the "Marker" strategy posted by James), so that's what d47_mp_CBO_MoAP_v1.patch does. At the very least I'm hoping this is an acceptable step in the right direction.

          As with my previous patch, this CBO_MoAP patch should not change any existing functionality because all of the new behavior depends on the existence of "probe predicates", which do not yet exist.

          Review comments are much appreciated (esp. w.r.t the bulk fetching changes)...

          A B added a comment -

          Attaching a patch, d47_mp_addlTestCases.patch, which adds some additional IN-list test cases to the lang/inbetween.sql test. These test cases all currently behave correctly; by adding them to inbetween.sql we can ensure that they will continue to behave correctly once the DERBY-47 changes have been completed.

          The underlying notion here is to make sure IN list behavior is correct when the left operand is a column reference that is a leading column in one or more indexes. The DERBY-47 changes will ultimately make it so that most of the new test cases result in an index-probing execution plan, thus we want to make sure that we're testing as many of the various index-based use cases as possible.

          Note that these test cases are just testing correctness of results; additional tests will be added later to verify that indexes are in fact being chosen as a result of the DERBY-47 changes.

          Patch committed with svn #512534:

          http://svn.apache.org/viewvc?view=rev&rev=512534

          Bryan Pendleton added a comment -

          Something about this SQL statement just tickled my fancy and made me smile!

          update bt1 set de = cast (i/2.8 as decimal(4,1)) where i >= 10 and 2 * (cast (i as double) / 2.0) - (i / 2) = i / 2;

          A B added a comment -

          Committed d47_mp_CBO_MoAP_v1.patch with svn 513839:

          URL: http://svn.apache.org/viewvc?view=rev&rev=513839

          A B added a comment -

          Attaching d47_mp_codeGen_v1.patch, which updates Derby code generation to account for the potential presence of IN-list probe predicates. This patch does the following:

          1 - Moves the code for generating a list of IN values into a new method, InListOperatorNode.generateListAsArray(). The new method is then called from two places:

          A. InListOperatorNode.generateExpression(): the "normal" code-path for
          generating IN-list bytecode (prior to DERBY-47 changes).

          B. PredicateList.generateInListValues(): new method for generating the IN-list
          values that will serve as the execution-time index "probe" values. This
          method also generates a boolean to indicate whether or not the values
          are already sorted (i.e. if we sorted them at compile time, which means
          they all must have been constants).

          2 - Adds code to ParameterNode that allows generation of a "place-holder" value (instead of the ParameterNode itself) for probe predicates. This is required because a probe predicate has the form "column = ?" where the right operand is an internally generated parameter node that does not actually correspond to a user parameter. Since that parameter node is "fake" we can't really generate it; instead we need to be able to generate a legitimate ValueNode (either a constant node or a "real" parameter node) to serve as the place-holder. The codeGen patch makes that possible.

          3 - Updates the generateExpression() method of BinaryOperatorNode to account for situations where the optimizer chooses a plan for which a probe predicate is not a useful start/stop key and thus is not being used for execution-time index probing. In this case we simply "revert" the probe predicate back to the InListOperatorNode from which it was created. Or put another way, we "give up" on index multi-probing and simply generate the original IN-list as a regular restriction.

          In creating this patch I realized that having the "revert" code in BinaryOperatorNode.generateExpression() is a "catch-all" for any probe predicates that are not "useful" for the final access path. So by doing the "revert" operation at code generation time we remove the need for the explicit "revertToSourceInList()" calls that I added to "modification of access paths" code in the previous patch (d47_CBO_MoAP). Since I could not see any benefit to reverting during MoAP vs. reverting at code gen time, I opted to go with the latter. So this patch also removes the now unnecessary "revertToSourceInList()" calls from PredicateList.java.

          4 - Adds logic to NestedLoopJoinStrategy to generate a new type of result set, MultiProbeTableScanResultSet, for probing an index at execution time. The new result set does not yet exist (incremental development) but the code to generate such a result set is added as part of this patch. Note that we should never choose to do "multi-probing" for a hash join; comments explaining why are in the patch, along with a sanity assertion to catch any cases for which that might incorrectly happen.

          5 - Adds a new method, "getMultiProbeTableScanResultSet()", to the ResultSetFactory interface. Also adds a corresponding stub method to GenericResultSetFactory. The latter is just a dummy method and will be filled in with the appropriate code as part of a subsequent patch.

          I ran derbyall and suites.All on Red Hat Linux with ibm142 and there were no new failures. Reviews are appreciated, as always. If I hear no objections I will commit this patch in a couple of days.
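          [Editor's note] Conceptually, the MultiProbeTableScanResultSet described in item 4 repositions the index once per probe value, using equal start and stop keys, instead of scanning one wide [min, max] range. A rough Java sketch of that behavior follows, using a TreeMap to stand in for the index; all names here are hypothetical, not Derby's actual classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

public class MultiProbeScanSketch {
    // For each probe value, do an exact index lookup (start key == stop
    // key == the probe value) and collect any matching row. A miss (e.g.
    // the value -1 below) simply produces no rows; no range is scanned.
    static List<String> multiProbe(NavigableMap<Integer, String> index, int[] probeValues) {
        List<String> rows = new ArrayList<>();
        for (int key : probeValues) {
            String row = index.get(key);
            if (row != null) {
                rows.add(row);
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        NavigableMap<Integer, String> index = new TreeMap<>();
        for (int i = 0; i < 10000; i++) {
            index.put(i, "row-" + i);
        }
        // select * from tbl where i1 in (-1, 9999): two probes, no full scan.
        System.out.println(multiProbe(index, new int[] {-1, 9999}));
        // prints: [row-9999]
    }
}
```

          This also illustrates why the original bug report's query (IN (-1, 100000)) is so cheap under multi-probing: the cost is proportional to the number of probe values, not to the range they span.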

          Bryan Pendleton added a comment -

          Hi Army, just wanted to let you know that I've been reading your notes and reading the patches. If I have any concrete comments or suggestions, I'll barge in, but so far it's all looked quite good to me. I particularly want to thank you for taking the time to comment the code so clearly and thoroughly; it makes the world of difference to us code-readers.

          A B added a comment -

          Committed d47_mp_codeGen_v1.patch with svn # 515795:

          URL: http://svn.apache.org/viewvc?view=rev&rev=515795

          And now attaching d47_mp_exec_v1.patch, which is a patch to implement execution-time "probing" given a probe predicate "place-holder" and a list of IN values. This patch creates a new execution-time result set, MultiProbeTableScanResultSet, to perform the probing. Generally speaking the process is as follows, where "probe list" (aka "probeValues") corresponds to the IN list in question.

          0 - Open a scan using the first value in the (sorted) probe list as a start AND stop key.

          Then for each call to "getNextRowCore()":

          1 - See if we have a row to read from the current scan position. If so, return that row (done).

          2 - If there are no more rows to read from the current scan position AND if there are more
          probe values to look at, then a) reopen the scan using the next probe value as the start/
          stop key and b) go back to step 1. Otherwise proceed to step 3.

          3 - Return null (no more rows).

          At a higher level the changes in exec_v1.patch make it so that repeated calls to MultiProbeTableScanResultSet.getNextRowCore() will first return all rows matching probeValues[0], then all rows matching probeValues[1], and so on (duplicate probe values are ignored). Once all matching rows for all values in probeValues have been returned, the call to getNextRowCore() will return null, thereby ending the scan.
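The numbered steps above can be sketched as follows. This is a toy behavioral model only: an int column and an in-memory array stand in for the real store scan, and all names here are illustrative, not Derby's actual internals.

```java
import java.util.Arrays;

// Hypothetical, simplified model of the multi-probe loop described above.
public class MultiProbeSketch {
    private final int[] probeValues; // sorted copy of the IN-list values
    private final int[] tableRows;   // stands in for the indexed column
    private int probeIdx = 0;
    private int scanPos = -1;        // -1: scan not yet opened for current probe

    public MultiProbeSketch(int[] probes, int[] rows) {
        this.probeValues = probes.clone(); // sort a clone, as the patch does
        Arrays.sort(this.probeValues);
        this.tableRows = rows;
    }

    /** Next row matching some probe value, or null when the scan is done (step 3). */
    public Integer getNextRowCore() {
        while (probeIdx < probeValues.length) {
            // Skip duplicate probe values before "opening" a scan for them.
            if (scanPos == -1 && probeIdx > 0
                    && probeValues[probeIdx] == probeValues[probeIdx - 1]) {
                probeIdx++;
                continue;
            }
            // Step 1: return the next row at the current scan position, if any.
            for (scanPos = scanPos + 1; scanPos < tableRows.length; scanPos++) {
                if (tableRows[scanPos] == probeValues[probeIdx]) {
                    return Integer.valueOf(tableRows[scanPos]);
                }
            }
            // Step 2: current probe exhausted; "reopen" with the next probe value.
            probeIdx++;
            scanPos = -1;
        }
        return null;
    }
}
```

Repeated calls return all rows for the first (sorted) probe value, then the next, and finally null, mirroring the description above.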

          In order to accommodate the above behavior, the following changes were made to existing files:

          1 - Add correct instantiation logic to the "getMultiProbeTableScanResultSet()"
          method of GenericResultSetFactory, which was just a stub method before this
          patch.

          2 - Overloaded methods in TableScanResultSet to allow the passing of a "probe value"
          into the openScanController() and reopenScanController() methods. The methods
          then use the probe value (if one exists) as the start/stop key for positioning
          a scan, instead of using the start/stop key passed into the result set constructor.

          3 - Made the iapi.types.DataType class implement the java.lang.Comparable interface
          for the sake of easy sorting (just let the JVM do the sort). Since DataType (the
          superclass of all datatypes and base implementation of the DataValueDescriptor
          interface) already has a "compare()" method that returns an integer to indicate
          less than, greater than, or equal, all we have to do is wrap that method inside
          a "compareTo()" method and we're done.
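The wrapping described in item 3 can be sketched like this; "SketchType" and its compare() method are stand-ins for Derby's DataType/DataValueDescriptor, not the real classes.

```java
// Minimal sketch: satisfy Comparable by delegating to an existing compare().
public class SketchType implements Comparable<SketchType> {
    private final int value;

    public SketchType(int value) { this.value = value; }

    public int getValue() { return value; }

    /** Pre-existing style comparison: negative, zero, or positive. */
    public int compare(SketchType other) {
        return Integer.compare(this.value, other.value);
    }

    /** Comparable is satisfied by simply delegating to compare(). */
    @Override
    public int compareTo(SketchType other) {
        return compare(other);
    }
}
```

Once compareTo() exists, Arrays.sort() does the rest ("just let the JVM do the sort").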

          There are two issues worth mentioning regarding this sort. First, the compareTo()
          method does not throw any exceptions, so if an error occurs while trying to compare
          two DataValueDescriptors, we will simply treat the values as "equal" when running
          in insane mode (in sane mode we will throw an assertion failure). Is this
          acceptable? If not, is there a better way to handle this, aside from writing my
          own sorting code? (which is doable but seems like overkill).

          Second, for some strange reason sorting the probeValues array directly (i.e.
          in-place sort) leads to incorrect parameter value assignment when executing a
          prepared statement multiple times. I was unable to figure out why that might
          be (maybe related to DERBY-827?). To get around the problem I create clones
          of the IN values and then sort the clones. That solves the problem but has
          the obvious drawback of extra memory requirements. I'm hoping that for now
          this is an okay workaround (progress, not perfection), but if anyone has any
          ideas as to what could be going on here, I'd appreciate the input.
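The clone-then-sort workaround amounts to the following sketch: sort a copy so the original array, which may be shared with parameter-value slots, is never reordered.

```java
import java.util.Arrays;

// Sketch of the workaround described above (illustrative, not Derby code).
public class CloneSortSketch {
    public static int[] sortedCopy(int[] probeValues) {
        int[] clones = probeValues.clone(); // extra memory, but leaves the
        Arrays.sort(clones);                // caller's array untouched
        return clones;
    }
}
```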

          And of course, if there are any other reasons why it's bad to make DataType
          implement the Comparable interface, I hope that reviewers can speak up. If
          it comes down to it I can always add a simple sort method to MultiProbeTSCRS
          and just use that.

          As with all preceding patches, this patch should not have any functional effect on Derby processing because the new behavior depends on probe predicates, which do not yet exist. I have not yet had a chance to run derbyall as a sanity check, but plan to do so before committing. In the meantime, questions/comments/feedback on exec_v1.patch as attached would be much appreciated.

          Bryan Pendleton added a comment -

          I was reading through d47_mp_codeGen_v1.patch, and the
          new getMultiProbeTableScanResultSet() method struck me as
          somewhat awkward in having 26 arguments to the method. I see
          you haven't actually implemented this method yet (or maybe you have;
          I'm a patch-or-two behind ), so maybe it's possible to explore why
          we ended up with 26 arguments here?

          A B added a comment -

          > maybe it's possible to explore why we ended up with 26 arguments here?

          Good question, Bryan. The relevant code in codeGen_v1.patch is as follows:

          + /* If we're going to generate a list of IN-values for index probing
          + * at execution time then we push TableScanResultSet arguments plus
          + * two additional arguments: 1) the list of IN-list values, and 2)
          + * a boolean indicating whether or not the IN-list values are already
          + * sorted.
          + */
          + if (genInListVals)
          + {
          +     numArgs = 26;

          What that comment does not say is that the reason we use TableScanResultSet arguments plus 2 is that the new result set, MultiProbeTableScanResultSet, extends TableScanResultSet and depends on TableScanResultSet to do most of the work. Therefore we need all of the usual (24) arguments for TableScanResultSet, plus two additional arguments for logic specific to multi-probing. The latest patch, d47_mp_exec_v1.patch, includes the new MultiProbeTableScanResultSet class, which hopefully shows how things are expected to work.

          > (or maybe you have; I'm a patch-or-two behind ),

          Oops! My apologies. For some reason I took your previous comment to mean that you had already looked at the various patches up to and including codeGen_v1, and that you didn't have any suggestions. I see now that you were (perhaps?) indicating that you were in the process of looking at the patches but were not yet done.

          Sorry for rushing on this one. I'll let the exec_v1 patch sit for a while (at least until Monday) to give you (and any other developers who may be in a similar situation) ample review time. I'll post again before I commit and if you have any feedback or else would like a few more days, feel free to say so.

          I appreciate you looking at these patches and didn't mean to hurry or otherwise overlook your comments. Take all the time you need, and please continue to ask any questions you may have.

          Thanks again for your time!

          Bryan Pendleton added a comment -

          Thanks Army for the good explanation. I didn't realize that the calls to this 26-argument
          method are generated. That makes a big difference; I was worried about the complexity
          of human beings writing and maintaining code which called the 26-argument method, but
          if it's only called by generated code it's a totally different story.

          I don't think you're rushing. Please keep on with the process as you've been doing it; if I
          come up with anything of substance we can take a look at it at that time.

          Bryan Pendleton added a comment -

          Hi Army,

          I finally got around to reading through the patches. Sorry it took a while.
          I haven't actually run any of the code, just given the patches a close
          reading, so some of these questions could have been answered that way, but
          hopefully it's all still useful to you. Without further ado...

          By the way, all of these are very minor things. These patches are excellent,
          and I learned a lot from reading them. This is great work!

          1) In the relOpPredCheck patch, you made this change to PredicateList.java:

          - if (relop == null || ! relop.isQualifier(optTable, false))
          + if (!pred.isRelationalOpPredicate() ||
          +     !pred.getRelop().isQualifier(optTable, false))

          I'm concerned that this change may have a hole for a possible NPE. Is it
          possible that there could arise a case where we have an IN-list predicate
          at this point, such that relop is null but in-list is not null, and then
          pred.isRelationalOpPredicate() would return false, but pred.getRelop()
          would return null?

          2) I spent some time studying the code in PredicateList.java which manipulates
          IN list operator predicates, and got a bit twisted around. There seems to
          be a lot of code which looks, more or less, like the following:

          RelationalOperator relop = pred.getRelop();
          InListOperatorNode inNode = pred.getSourceInList();
          boolean isIn = (inNode != null);

          if (relop == null)
          {
              /* if it's "in" operator, we generate dynamic start and stop key
               * to improve index scan performance, beetle 3858.
               */
              if (pred.getAndNode().getLeftOperand() instanceof InListOperatorNode &&
                  ! ((InListOperatorNode)pred.getAndNode().getLeftOperand()).getTransformed())
              {
                  isIn = true;
                  inNode = (InListOperatorNode) pred.getAndNode().getLeftOperand();
              }

          My naive reaction to code like this is "shouldn't this be handled in
          pred.getSourceInList()"?

          That is, it seems like there's a lot of code in PredicateList.java which
          calls pred.getSourceInList, but then does this other complicated "if" test
          to get the InListOperatorNode a different way.

          Can you help me understand why there are these two different ways to get
          the InListOperatorNode for a predicate (via getSourceInList() and via
          getAndNode.getLeftOperand() ), and what do you think about creating a
          "helper" function in Predicate.java, similar to getSourceInList, to clean up
          the 4 or 5 places in PredicateList.java where we have code like the above?
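To make the suggestion concrete, a combined accessor might look like the sketch below. The nested classes are simplified stand-ins for Derby's Predicate/AndNode/InListOperatorNode, and getInListOp() is an invented name, not an existing Derby method.

```java
// Hypothetical helper folding both lookup paths into one method.
public class PredicateHelperSketch {
    interface ValueNode {}

    static class InListOperatorNode implements ValueNode {
        private final boolean transformed;
        InListOperatorNode(boolean transformed) { this.transformed = transformed; }
        boolean getTransformed() { return transformed; }
    }

    static class AndNode {
        private final ValueNode left;
        AndNode(ValueNode left) { this.left = left; }
        ValueNode getLeftOperand() { return left; }
    }

    static class Predicate {
        private final AndNode andNode;
        private final InListOperatorNode sourceInList;

        Predicate(AndNode andNode, InListOperatorNode sourceInList) {
            this.andNode = andNode;
            this.sourceInList = sourceInList;
        }

        /** Combined accessor: a probe predicate's source IN-list if present,
         *  otherwise an untransformed IN-list directly under the AND node. */
        InListOperatorNode getInListOp() {
            if (sourceInList != null) {
                return sourceInList;
            }
            ValueNode left = andNode.getLeftOperand();
            if (left instanceof InListOperatorNode
                    && !((InListOperatorNode) left).getTransformed()) {
                return (InListOperatorNode) left;
            }
            return null;
        }
    }
}
```

Callers would then use one lookup instead of repeating the instanceof/getTransformed() dance at each of the 4 or 5 sites.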

          3) The new method PredicateList.generateInListValues() seems to go to some
          pains to handle the case where it is called, but this predicate list
          actually doesn't have any in list values to generate. My reaction was:
          "why is this routine getting called in this case?" It seems like
          NestedLoopJoinStrategy never calls generateInListValues unless there are
          actually in-list values to generate, so why does generateInListValues
          need to handle the case where there is no InListOperatorNode?

          That is, does the final "else" clause in generateInListValues ever actually
          occur?

          4) The code which traverses predicate lists seems to always traverse them
          in backwards order, e.g. this code from HashJoinStrategy.java:

          for (int i = storeRestrictionList.size() - 1; i >= 0; i--)

          Why do we always traverse these backwards? Is this just an optimization
          in order to call the size() method only once? Or is there something
          deeper going on?
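For what it's worth, one property of the backward form beyond caching size() is that it stays correct when elements are removed mid-loop; whether that is the actual reason in HashJoinStrategy is an open question. A small standalone illustration (not Derby code):

```java
import java.util.List;

public class BackwardLoopSketch {
    // Removing an element at index i never shifts the indices below i,
    // so a backward loop can remove as it goes; the equivalent forward
    // loop would skip the element that slides into the removed slot.
    public static List<Integer> removeEvens(List<Integer> list) {
        for (int i = list.size() - 1; i >= 0; i--) {
            if (list.get(i) % 2 == 0) {
                list.remove(i);
            }
        }
        return list;
    }
}
```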

          5) In InListOperatorNode, I at first thought that you had put this code in,
          but then after more study I saw that this was just an artifact of the
          patch program and you hadn't introduced this code, but just re-arranged it
          a bit. Still, the code showed up with "+" signs in the patch so I looked
          at it, and feel compelled to comment on it.

          This is at about line 413 in InListOperatorNode.java:

          //LocalField receiverField =
          // acb.newFieldDeclaration(Modifier.PRIVATE, receiverType);

          leftOperand.generateExpression(acb, mb);
          mb.dup();
          //mb.putField(receiverField); // instance for method call
          /mb.getField(receiverField);/ mb.upCast(leftInterfaceType); // first arg

          Do you know what is going on here? What is this commented out code, and
          can we clean it up? The last line in particular is quite frustrating with
          its snippet of actual code sandwiched in between two comments on the line.

          6) I seem to recall some discussions on the mailing lists from people who
          had code generation systems which tended to generate horrific IN clauses
          with, for example, upwards of 1500 elements in the IN list. I think it would
          be nice to have a test or two which verified that we can actually compile
          and run such a SQL statement. You may have already done this, and I missed
          it, but most of the test cases in the addTestCases patch seemed to have
          IN list clauses with no more than 8-10 values in them.

          A test case like this, while hideously ugly, would help to ensure that
          we didn't have some problematic N^2 or N! handling of the IN list values
          somewhere that would explode if we had too many IN list values.
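Such a test could generate the statement text rather than hand-write 1500 literals; a sketch (the table and column names here are made up):

```java
public class LargeInListSketch {
    // Builds "select * from t where i1 in (0,1,...,n-1)". Feeding the result
    // of buildQuery(1500) through a prepare/execute cycle would exercise the
    // large-IN-list case suggested above.
    public static String buildQuery(int n) {
        StringBuilder sb = new StringBuilder("select * from t where i1 in (");
        for (int i = 0; i < n; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(i);
        }
        return sb.append(')').toString();
    }
}
```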

          7) Generally speaking, I like the new convenience methods in Predicate.java,
          but I do worry about whether we need to start being concerned about
          overall efficiency in the optimizer. For example, consider a change like:

          RelationalOperator relop = pred.getRelop();

          - if (relop != null)
          + if (pred.isRelationalOpPredicate())

          since isRelationalOpPredicate may end up calling getRelop() two more times,
          and since getRelop() calls getLeftOperand() twice and does an instanceof
          check, we're incrementally adding a fair amount of code.

          It's really important that the optimizer code be clear and easy to read,
          so I'm not suggesting any drastic measures. I'm just sort of thinking
          out loud about some concerns about how to write an optimizer which is
          simultaneously both

          • clear and easy to read and understand (this is the most important)
          • yet also efficient to run
          A B added a comment -

          Bryan Pendleton (JIRA) wrote:
          > I finally got around to reading through the patches. I haven't actually
          > run any of the code, just given the patches a close reading

          Thank you very much for taking the time to do such a thorough review, Bryan. I definitely appreciate the feedback. My attempted answers are below, but if anything is still unclear, please do not hesitate to ask again. Better to get this right the first time.

          > 1) In the relOpPredCheck patch, you made this change to PredicateList.java:
          >
          > - if (relop == null || ! relop.isQualifier(optTable, false))
          > + if (!pred.isRelationalOpPredicate() ||
          > + !pred.getRelop().isQualifier(optTable, false))
          >
          > I'm concerned that this change may have a hole for a possible NPE. Is it
          > possible that there could arise a case where we have an IN-list predicate
          > at this point, such that relop is null but in-list is not null, and then
          > pred.isRelationalOpPredicate() would return false, but pred.getRelop()
          > would return null?

          If we assume that we do in fact have a situation where "relop is null but in-list is not null", then pred.isRelationalOpPredicate() will, as you said, return false. That means that "!pred.isRelationalOpPredicate()" will return true, and since we are using a short-circuited OR operator, I don't think we would ever get to the pred.getRelop() call in that case, would we?

          The intent was that we only get to the second part of the OR if we know for a fact that pred.getRelop() will return a non-null value. And if "pred" is a relational op predicate (i.e. if the first part of the OR evaluates to "false") then that should always be the case. It is of course possible that I missed something and that the code doesn't follow the intent; please let me know if you can think of such a case...
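The short-circuit behavior is easy to demonstrate in isolation; a toy version of the guarded test (the method names mimic, but are not, the Derby ones):

```java
public class ShortCircuitSketch {
    static int relopCalls = 0;

    static boolean isRelationalOpPredicate() { return false; }

    static Object getRelop() {
        relopCalls++;
        return null; // would NPE if dereferenced
    }

    // Mirrors the guarded test: when the left side of || is true, the right
    // side, which would dereference a null relop, is never evaluated.
    public static boolean check() {
        return !isRelationalOpPredicate()
                || !getRelop().toString().isEmpty();
    }
}
```

check() returns true without ever calling getRelop(), so no NPE is possible along that path.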

          > 2) I spent some time studying the code in PredicateList.java which manipulates
          > IN list operator predicates, and got a bit twisted around.

          [ snip code fragment ]

          > My naive reaction to code like this is "shouldn't this be handled in
          > pred.getSourceInList()"?

          Good point. When I was making the changes I wrote "getSourceInList()" specifically for the new probe predicates; I was just leaving the existing InListOperatorNode logic as it was. But you're right, it's probably cleaner to expand that method to cover the "old" IN-list cases, as well. I will look into this for a follow-up patch.

          > Can you help me understand why are there these two different ways to get
          > the InListOperatorNode for a predicate (via getSourceInList() and via
          > getAndNode.getLeftOperand() ),

          The former (getSourceInList()) was added to handle DERBY-47 probe predicates while the latter (getAndNode.getLeftOperand()) was already there for handling the "normal" (pre DERBY-47) InLists. But you're right, it might be good to combine the two--I'll try to do that.

          > 3) The new method PredicateList.generateInListValues() seems to go to some
          > pains to handle the case where it is called, but this predicate list
          > actually doesn't have any in list values to generate. My reaction was:
          > "why is this routine getting called in this case?" It seems like
          > NestedLoopJoinStrategy never calls generateInListValues unless there are
          > actually in-list values to generate, so why does generateInListValues
          > need to handle the case where there is no InListOperatorNode?
          >
          > That is, does the final "else" clause in generateInListValues ever actually
          > occur?

          Great catch! When I first wrote the DERBY-47 changes I had all of the probing logic inside the TableScanResultSet class, and thus the additional arguments were always generated, even in cases where no probe predicates existed. I later realized that this was too much overhead since most TableScanResultSets will probably not be doing multi-probing. So I refactored the code and created a new result set, MultiProbeTableScanResultSet, which is only generated when probe predicates exist. But I forgot to remove the corresponding logic from the generateInListValues() method, hence the "else".

          So you're absolutely right: the "else" clause can be removed here and the code can be cleaned up accordingly. I'll address that in a follow-up patch.

          > 4) The code which traverses predicate lists seems to always traverse them
          > in backwards order, e.g. this code from HashJoinStrategy.java:
          >
          > for (int i = storeRestrictionList.size() - 1; i >= 0; i--)
          >
          > Why do we always traverse these backwards? Is this just an optimization
          > in order to call the size() method only once? Or is there something
          > deeper going on?

          This is just habit more than anything. A lot of the optimizer-related work that I've been doing requires the removal of certain predicates from predicate lists at various points in the code, and in that case reverse traversal is better (because removal of elements from the rear does not require adjustments to the loop index). So I often write loop iteration with backwards traversal out of sheer habit, even when removal of predicates is not happening. If you think this makes the code more confusing or less intuitive I have no problems with switching to forward traversal.
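
          The benefit of reverse traversal when removing is easy to see in a small sketch (illustrative only, using a plain list of integers rather than a Derby PredicateList):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration of the reverse-traversal idiom: removing an element while
// iterating backwards never shifts the positions of elements not yet
// visited, so the loop index needs no adjustment after a remove().
public class ReverseRemovalDemo {
    static List<Integer> removeEvens(List<Integer> list) {
        for (int i = list.size() - 1; i >= 0; i--) {
            if (list.get(i) % 2 == 0) {
                list.remove(i); // elements at indices < i are untouched
            }
        }
        return list;
    }

    public static void main(String[] args) {
        List<Integer> preds = new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5, 6));
        System.out.println(removeEvens(preds)); // [1, 3, 5]
    }
}
```

          A forward loop doing the same removals would have to decrement `i` (or skip the increment) after each `remove(i)` to avoid passing over the element that shifts into the vacated slot.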

          > 5) In InListOperatorNode, I at first thought that you had put this code in,
          > but then after more study I saw that this was just an artifact of the
          > patch program and you hadn't introduced this code, but just re-arranged it
          > a bit. Still, the code showed up with "+" signs in the patch so I looked
          > at it, and feel compelled to comment on it.
          >
          > This is at about line 413 in InListOperatorNode.java:
          >
          > //LocalField receiverField =
          > // acb.newFieldDeclaration(Modifier.PRIVATE, receiverType);
          >
          > leftOperand.generateExpression(acb, mb);
          > mb.dup();
          > //mb.putField(receiverField); // instance for method call
          > /mb.getField(receiverField);/ mb.upCast(leftInterfaceType); // first arg
          >
          > Do you know what is going on here? What is this commented out code, and
          > can we clean it up? The last line in particular is quite frustrating with
          > its snippet of actual code sandwiched in between two comments on the line.

          Short answer is that I don't know what the commented-out code is for, and my guess is that yes, it can (and should) be cleaned up. But I frequently comment on other people's patches that they should refrain from making unrelated "cleanup" changes, as it makes the patches harder to review, so I thought maybe I should follow my own advice. Feel free to commit this two-line cleanup as a separate patch if you'd like--or I can do it myself. Either way, so long as it's a separate commit, I think that's a good idea.

          > 6) I seem to recall some discussions on the mailing lists from people who
          > had code generation systems which tended to generate horrific IN clauses
          > with, for example, upwards of 1500 elements in the IN list. I think it would
          > be nice to have a test or two which verified that we can actually compile
          > and run such a SQL statement. You may have already done this, and I missed
          > it, but most of the test cases in the addTestCases patch seemed to have
          > IN list clauses with no more than 8-10 values in them.

          There is an existing test case in lang/inbetween.sql that has such a test for "normal" (pre DERBY-47) IN-list processing. The query in that case has 4056 values (constants) in it.

          I was also looking at the repro attached to DERBY-47, which generates and executes IN-lists with 2500 or more values. I'd like to pull out the relevant pieces and include them in a new JUnit test as a way to verify that things work for large IN lists when multi-probing is in effect, as well. That's still a forthcoming patch. For the record, though, I ran the repro on a database with 100,000 rows in it and a single IN-list of 2500 values, and everything worked fine--even when multi-probing was in effect. This is as I would expect, since the code to generate the IN-list is the same before and after my changes.
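
          A stress test of this kind mostly comes down to generating the query text; a minimal sketch (table/column names here are illustrative, not from the actual repro) might look like:

```java
import java.util.StringJoiner;

// Hypothetical sketch of a large-IN-list stress query generator of the
// kind described above. The table and column names are made up.
public class LargeInListDemo {
    static String buildInListQuery(String table, String column, int n) {
        StringJoiner values = new StringJoiner(", ");
        for (int i = 0; i < n; i++) {
            values.add(Integer.toString(i));
        }
        return "SELECT * FROM " + table
             + " WHERE " + column + " IN (" + values + ")";
    }

    public static void main(String[] args) {
        // Build a query with 2500 IN-list constants, as in the repro runs.
        String sql = buildInListQuery("scanfixed", "i1", 2500);
        System.out.println(sql.substring(0, 40) + "...");
    }
}
```

          The generated string would then be compiled and executed through JDBC to confirm that both compilation and multi-probe execution handle lists of this size.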

          > 7) Generally speaking, I like the new convenience methods in Predicate.java,
          > but I do worry about whether we need to start being concerned about
          > overall efficiency in the optimizer. For example, consider a change like:
          >
          > RelationalOperator relop = pred.getRelop();
          >
          > - if (relop != null)
          > + if (pred.isRelationalOpPredicate())
          >
          > since isRelationalOpPredicate may end up calling getRelop() two more times,
          > and since getRelop() calls getLeftOperand() twice and does an instanceof
          > check, we're incrementally adding a fair amount of code.

          Great observation.

          > It's really important that the optimizer code be clear and easy to read,
          > so I'm not suggesting any drastic measures. I'm just sort of thinking
          > out loud about some concerns about how to write an optimizer which is
          > simultaneously both
          > - clear and easy to read and understand (this is the most important)
          > - yet also efficient to run

          This is a very valid point. I've been focusing on the first goal, but you're right, the above change is perhaps too costly. I'll look into remedying this with a follow-up patch...
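
          The cost concern can be seen in miniature with a counter (the names below are illustrative stand-ins, not Derby's actual methods): a convenience wrapper repeats the underlying lookup, while caching the result in a local runs it once.

```java
// Hypothetical sketch of the efficiency trade-off discussed above.
public class CachedLookupDemo {
    static int lookups = 0; // counts how often the "expensive" check runs

    static Object getRelop() {
        lookups++;              // stands in for the getLeftOperand()/instanceof work
        return new Object();
    }

    static boolean isRelationalOpPredicate() {
        return getRelop() != null; // convenience wrapper: one extra lookup
    }

    public static void main(String[] args) {
        // Convenience-method style: two lookups for one logical question.
        if (isRelationalOpPredicate()) { getRelop(); }
        System.out.println(lookups); // 2

        // Cached style: one lookup, reused.
        lookups = 0;
        Object relop = getRelop();
        if (relop != null) { /* use relop directly */ }
        System.out.println(lookups); // 1
    }
}
```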

          Thank you again for these excellent review comments. I really (really) appreciate the extra pair of eyes and the great suggestions. As I said above, if any of the above answers are unclear or unsatisfactory, please ask again--I'll try to explain whatever it is that may not make sense.

          PS I plan to commit the exec_v1.plan patch later today (Monday) unless I hear objections from anyone...

          A B added a comment -

          Committed d47_mp_exec_v1.patch with svn 517470:

          URL: http://svn.apache.org/viewvc?view=rev&rev=517470

          Attaching d47_mp_preprocess_v1.patch, which is (finally) the patch that actually creates "probe predicates" during preprocessing, thus allowing the code changes in all previous patches to take effect. Once this patch is committed, Derby will start re-writing IN lists as probe predicates and, if the optimizer thinks it is best to do so, will start doing index "multi-probing" at execution time to avoid excessive scanning. The changes in this patch affect "preprocessing" logic as follows:

          1. Replaces "A" with "B", where "A" is existing logic that creates a BETWEEN
          node for IN-lists containing all constants, and "B" is new logic that
          creates a "probe predicate" for IN-lists containing all constants and/or
          parameter nodes. The probe predicates are then used throughout optimization,
          modification of access paths, code generation, and execution time (as
          appropriate) in the manner described by previous patches.

          2. Adds some additional logic to OrNode preprocessing to allow the conversion
          of queries like:

          select ... from T1 where i in (2, 3) or i in (7, 10)

          into queries that look like:

          select ... from T1 where i in (2, 3, 7, 10)

          This is really just an extension of the existing logic to transform a
          chain of OR nodes into an IN-list.

          3. Adds logic to PredicateList.pushExpressionsIntoSelect() to correctly
          copy "probe predicates" so that the left operand (column reference)
          is pointing to the correct place when we do static pushing of one-
          sided predicates (which is what a "probe predicate" is).

          4. Adds a new method to ValueNodeList that is used for checking to see if
          a list of IN values consists solely of constant and/or parameter nodes
          (there are no other expressions or column references).
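
          The transformation in point 2 can be sketched conceptually like this (illustrative Java over plain lists, not Derby's actual OrNode preprocessing code):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Conceptual sketch of the OR-to-IN transformation described above: two
// IN-lists on the same column, joined by OR, collapse into a single
// IN-list over the union of their values.
public class OrToInListDemo {
    static List<Integer> mergeInLists(List<Integer> left, List<Integer> right) {
        List<Integer> merged = new ArrayList<>(left);
        merged.addAll(right); // "i IN (2,3) OR i IN (7,10)" -> "i IN (2,3,7,10)"
        return merged;
    }

    public static void main(String[] args) {
        System.out.println(mergeInLists(Arrays.asList(2, 3), Arrays.asList(7, 10)));
        // [2, 3, 7, 10]
    }
}
```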

          I'm also attaching a corresponding patch, d47_mp_masters_v1.patch, which contains all of the diffs caused by the new multi-probing functionality. As is typical with tests that print out query plans, a simple change in the execution-time behavior can lead to massive diffs. I manually looked at all of the diffs in the masters_v1.patch and it is my belief that all but one of them are acceptable and expected given the changes for this issue. The one exception is for store/readlocks.sql, which is a test about which I know very little. My guess (or perhaps more accurately, my hope) is that this readlocks diff is okay, but it would be great if someone who knows more about it could verify.

          Note that I've separated preprocess_v1.patch from masters_v1.patch for ease of review, but they both need to be committed at the same time in order to avoid failures in the nightly regression tests.

          As always, I'd appreciate it if anyone has the time to look these changes over and make comments. If there are no objections I plan to commit these two patches Wednesday afternoon (March 14th, PST).

          Remaining tasks once preprocess_v1 is committed:

          • Address the (excellent) review comments raised by Bryan in one or more
            follow-up patches.
          • Add test cases to verify that Derby is now behaving as desired in the
            face of IN list restrictions on columns with useful indexes.
          • Post some simple numbers to show the improvement that I see when running
            Derby47PerformanceTest.java (attached to this issue) before and after
            the changes for this issue. Also, discuss a couple of areas to investigate
            post-DERBY-47 (i.e. things to consider as separate Jira issues).

          I'm hoping to have most of this work posted by the end of the week, but of course that all depends on the feedback I receive and the amount of other "stuff" that comes up between now and then.

          A B added a comment -

          Attaching the readlocks diff with more context to hopefully help in reviewing...

          Bryan Pendleton added a comment -

          Hi Army, I had a look at the preprocess and master patches and they look
          good to me. Here are a couple of small thoughts I had which you might want to
          consider down the road.

          0) You are right about the compound IF statement that we discussed in the
          previous set of comments. I misread the logic, and I agree that there is
          not any NPE hole there. Thanks for the further explanation.

          1) In one of the comments, you said:

          We intentionally use a parameter node instead of a constant node
          because the IN-list has more than one value

          It's not always true that the IN-list has more than one value, right?
          That is, it would be legal, if not very useful, to write

          SELECT * FROM t WHERE c IN (1)

          As I read the code near that comment, it didn't seem that you were actually
          depending on the IN-list having more than one value. Rather, you were
          choosing a data structure which could handle multiple values, but which can
          handle a single value just as well.

          But I thought I'd mention it just to be sure.

          2) Do you have any test cases of the form

          WHERE c IN (SELECT c_prime from t_prime)

          That is, where the IN-list is neither a list of literal constants, nor a
          list of parameter markers, but is instead a subquery whose values will
          be used as the list.

          Does such a query generate and use the new style Multi-Probe processing?

          3) Do you have any test cases in which the IN-list predicate references a
          column in a UNION or a UNION view, thus requiring pushing the IN-list
          predicate down and pulling it back up?

          4) In the change to OrNode.java, my eye was caught by the variable name "beon".
          I've seen that a common convention is to use an acronym, so maybe "beon"
          stood for Binary Equality Operator Node, or something like that, but the
          actual datatype is Binary*Relational*OperatorNode, so I would have
          expected to see a variable "bron".

          BinaryRelationalOperatorNode beon = null;

          It's totally unimportant, but I saw it and thought I'd mention it.

          5) I looked at the updated masters; to my eye they show that the new probe
          predicate is working, and the optimizer is choosing to use index-to-base
          processing for these predicates rather than the formerly-chosen table
          scans, so this looks great to me.

          6) I have no guidance to offer regarding the readlocks diff, sorry.

          7) For some reason, I expected to see something more vivid indicating the
          use of the new execution strategy in the query plans, I thought maybe I'd
          see something like "MultiProbeTableScanResultSet" in query dumps? Is it
          just these tests that don't show that, and other query dumps would? Or is
          the only indication that the new probing code has been chosen the use of
          the index in place of the table scan?

          A B added a comment -

          > 1) In one of the comments, you said:
          >
          > We intentionally use a parameter node instead of a constant node
          > because the IN-list has more than one value
          >
          > It's not always true that the IN-list has more than one value, right?
          > That is, it would be legal, if not very useful, to write
          >
          > SELECT * FROM t WHERE c IN (1)

          Yes, an IN-list can contain just a single value. However, if such an IN-list occurs we will convert it into an equality predicate as part of the first "if" branch in the preprocess() method:

          /* Check for the degenerate case of a single element in the IN list.
           * If found, then convert to "=".
           */
          if (rightOperandList.size() == 1)
              ...

          Thus we won't ever get to the code referenced above. But you're right, it might be good to add an explanatory comment to explain this.
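The degenerate-case conversion described above can be illustrated with a small sketch. This is hypothetical code, not Derby's actual preprocess() implementation; the DegenerateInList class and its rewrite() helper are invented for illustration only.

```java
import java.util.List;

// Hypothetical illustration only: this class and its rewrite() helper are not
// part of Derby; they just mimic the preprocess() behavior described above,
// where a one-element IN list degenerates into an "=" predicate.
public class DegenerateInList {
    public static String rewrite(String column, List<String> rightOperandList) {
        // Degenerate case: a single element converts to an equality predicate.
        if (rightOperandList.size() == 1) {
            return column + " = " + rightOperandList.get(0);
        }
        // Otherwise the IN list is left as-is (a candidate for multi-probing).
        return column + " IN (" + String.join(", ", rightOperandList) + ")";
    }

    public static void main(String[] args) {
        System.out.println(rewrite("c", List.of("1")));           // c = 1
        System.out.println(rewrite("i1", List.of("-1", "9999"))); // i1 IN (-1, 9999)
    }
}
```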

          > 2) Do you have any test cases of the form
          >
          > WHERE c IN (SELECT c_prime from t_prime)
          >
          > That is, where the IN-list is neither a list of literal constants, nor a
          > list of parameter markers, but is instead a subquery whose values will
          > be used as the list.

          A few test cases for this kind of query already exist in lang/inbetween.sql and also in lang/subquery.sql. So I did not add any new test cases with my changes. Is there anything in particular you think could be problematic here?

          > Does such a query generate and use the new style Multi-Probe processing?

          No, it does not. Multi-probe processing only occurs if the IN-list is solely comprised of constant and/or parameter nodes. A subquery is neither constant nor parameter, hence no multi-probing will occur. The code for this is in the preprocess() method of InListOperatorNode:

          else if ((leftOperand instanceof ColumnReference) &&
                   rightOperandList.containsOnlyConstantAndParamNodes())
          {
              <create probe predicate>
          }
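The eligibility rule above can be sketched as follows. This is a hypothetical illustration with invented names (ProbeEligibility, Kind), not Derby's actual node classes; it only mirrors the intent of containsOnlyConstantAndParamNodes().

```java
import java.util.List;

// Hypothetical sketch (invented names, not Derby's node classes): an IN list
// qualifies for multi-probing only when every element is a constant or a
// parameter marker, mirroring containsOnlyConstantAndParamNodes().
public class ProbeEligibility {
    public enum Kind { CONSTANT, PARAMETER, SUBQUERY, EXPRESSION }

    public static boolean containsOnlyConstantAndParamNodes(List<Kind> inList) {
        for (Kind k : inList) {
            if (k != Kind.CONSTANT && k != Kind.PARAMETER) {
                return false; // e.g. a subquery disqualifies the whole list
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // WHERE c IN (1, 2, ?)  -> multi-probe candidate
        System.out.println(containsOnlyConstantAndParamNodes(
                List.of(Kind.CONSTANT, Kind.CONSTANT, Kind.PARAMETER)));
        // WHERE c IN (SELECT c_prime FROM t_prime)  -> not a candidate
        System.out.println(containsOnlyConstantAndParamNodes(
                List.of(Kind.SUBQUERY)));
    }
}
```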

          > 3) Do you have any test cases in which the IN-list predicate references a
          > column in a UNION or a UNION view, thus requiring pushing the IN-list
          > predicate down and pulling it back up?

          I added a few, simplified test queries for this situation to lang/inbetween.sql as part of d47_mp_addlTestCases.patch (under the heading "Nested queries with unions and top-level IN list"). There are also a handful of queries in lang/predicatePushdown.sql that include IN lists in addition to equality predicates. Feel free to comment if you think more testing should be done here...

          > 4) In the change to OrNode.java, my eye was caught by the variable name "beon".
          > I've seen that a common convention is to use an acronym, so maybe "beon"
          > stood for Binary Equality Operator Node, or something like that, but the
          > actual datatype is Binary*Relational*OperatorNode, so I would have
          > expected to see a variable "bron".
          >
          > BinaryRelationalOperatorNode beon = null;
          >
          > It's totally unimportant, but I saw it and thought I'd mention it.

          Oops, good catch! Just a typo; I will fix it up.

          > 5) I looked at the updated masters; to my eye they show that the new probe
          > predicate is working, and the optimizer is choosing to use index-to-base
          > processing for these predicates rather than the formerly-chosen table
          > scans, so this looks great to me.

          Thank you for taking a look at these--this was an unexpected but very welcome review!

          > 7) For some reason, I expected to see something more vivid indicating the
          > use of the new execution strategy in the query plans, I thought maybe I'd
          > see something like "MultiProbeTableScanResultSet" in query dumps? Is it
          > just these tests that don't show that, and other query dumps would?

          This had occurred to me as well, but then I realized that in all of the test cases the MultiProbeTableScanResultSet shows up as the child of an IndexToBaseRowResultSet, which I think means it doesn't actually get printed in the query plans (only the info for the IndexToBaseRowResultSet is printed). This is similar, I think, to BulkTableScanResultSet, which is not printed in query plans either (I don't think...I believe the only way to tell is by looking at the fetch size).

          > Or is the only indication that the new probing code has been chosen the
          > use of the index in place of the table scan?

          Right now I think the only way to tell is to look at the use of the index and the number of rows visited/qualified: if we used an index but did not do multi-probing, we will probably see far more rows visited than if we used multi-probing (that won't always be the case, but it is generally true). Perhaps work to add an explicit indication of multi-probing to the query plan can be handled as a separate enhancement?

          These are great observations and great questions--thank you for posing them! And if you have any others, please continue to ask. I'm definitely grateful for the feedback...

          Mike Matrigali added a comment -

          I took a look at the diffs in readlocks and I believe all are "correct" with respect to your changes. It looks to me like there is an existing bug not affected by your code; see case 6 below. The readlocks test runs the same set of tests through multiple
          setups, varying isolation level, row size, and unique vs. non-unique index. In all cases the diffs you are seeing are due to
          a test of the locks gotten from walking a cursor down the result set from 'select a from a where a = 5 or a = 7'. As described, it is expected that your IN logic change should apply to this query when there is an index on a. I have described what I think is happening in the diffs below; diff line numbers come from applying your master patch to a current codeline and then running svn diff.

          Diff line notation is from a svn diff after applying the master patch
          posted to DERBY-47

          1) @@ -4946,7 +4946,7 @@^M
          o diff ok
          o before the change, the first next would bulk load all rows, leaving a
          "scan lock" on the last page (3,1). Now on the first next the code does
          a probe for the 5 from (a=5 or a=7), so the first lock query shows a
          scan lock on (2,1) associated with 5. There are no real row locks as
          this is a read-uncommitted test.

          2) @@ -8103,6 +8103,7 @@^M
          @@ -8112,6 +8113,7 @@^M
          o diff ok, shows 1 more row lock after each next in an expected OR case
          o before the change, the first next would bulk load all rows, and by the
          time the lock query was executed all locks would have been released due
          to read committed. Now, because of probing, we only get locks as we
          probe each value, and the test shows the lock is held during the next
          call and then released when we move on to the next value.

          3) @@ -8956,6 +8958,7 @@^M
          @@ -8965,6 +8968,7 @@^M
          o diff ok, same reasoning as 2

          4) @@ -11255,7 +11259,8 @@^
          @@ -11265,6 +11270,7 @@^M
          o diff ok, same reasoning as 2 - row numbers are different from 2 because
          of different padding in test table.

          5) @@ -12101,6 +12107,7 @@^M
          @@ -12110,6 +12117,7 @@^M
          o diff ok, same reasoning as 4

          6) @@ -14746,7 +14754,6 @@^M
          o I think there is a bug in existing code, the incremental diff looks ok.
          o there should not be any locks left after the scan_cursor is closed in
          read committed, but there are at line 14762 of the original test.

          7) @@ -15752,7 +15759,6 @@^M
          o same as 6

          8) @@ -18421,9 +18427,8 @@^M

          o same as 6

          9) @@ -19421,7 +19426,6 @@^M
          o same as 6

          10) @@ -21779,8 +21783,6 @@^M
          o diff ok!!
          o this is a good test that shows that the new code only visits the 2 rows
          of the or clause and does not get locks on any other rows under
          serializable with a unique index.
          The old code shows it scanning the range and unnecessarily
          locking an extra row.

          11) @@ -21791,7 +21793,6 @@^M
          o diff ok
          o same as 10

          12) @@ -21799,7 +21800,6 @@^
          o diff ok
          o same as 10

          13) @@ -22639,8 +22639,6 @@^M
          o diff ok
          o not as good a test as 10. Because of previous-key locking and the very
          small data set, we lock the same number of rows both before and after.
          The diff does show a difference in processing between before and after.
          If there had been more than one row between 5 and 7 with the non-unique
          index, it would have shown fewer rows locked under the new code vs. the
          old code. Adding a test for "IN (1, 7)" would show this off. If you are
          going to add a new test, I would suggest checking in the current set of
          diffs and then adding the separate test, as it is easier to identify
          diffs from new tests.

          14) @@ -24974,11 +24972,9 @@^M
          o diff ok
          o same as 13

          15) @@ -25831,8 +25827,6 @@^M
          o same as 13

          Mike Matrigali added a comment -

          In my writeup on the diffs, ignore the comments about there being a possible bug. I incorrectly thought those cases were
          read committed cases but they actually were repeatable read cases. I was searching for the wrong string to determine where I was in the test.

          All the diffs look fine and expected with army's changes.

          A B added a comment -

          Thanks again for the excellent review, Mike.

          Unless I hear otherwise I plan to commit the preprocess patch later today, after incorporating Bryan's most recent comments. I will then work on the follow-up patch(es) to address Bryan's original set of review comments (thanks Bryan!).

          And finally, I will try to add a new test to verify the functional changes. That said, I was hoping to add a new test to the regression suite based on Derby47PerformanceTest.java as attached to this issue. However, I just noticed that the attachment does not grant license to ASF for inclusion in ASF works.

          James Synge, are you willing to grant such rights for the test program that you attached? If so, can you re-attach the file and check the appropriate box on the "Attach File" screen?

          James Synge added a comment -

          This is the same as the file I attached on 2006-09-06, but now with the license granted to ASF.

          A B added a comment -

          I made some slight modifications to the preprocess patch in accordance with Bryan's review comments, and then committed with svn #518322:

          URL: http://svn.apache.org/viewvc?view=rev&rev=518322

          More specifically, preprocess_v2 differs from preprocess_v1 in that it:

          1 - Adds a comment to help clarify what happens in the case of an IN-list with a single value in it.

          2 - Renames "beon" to "bron" in OrNode.java to reflect the fact that it is a BinaryRelationalOperatorNode, not a BinaryEqualityOperatorNode.

          derbyall ran cleanly on Red Hat Linux with ibm142.

          Thanks again to Bryan and Mike for the reviews.

          A B added a comment -

          Attaching d47_mp_cleanup_v1.patch, which is a patch to address the review comments made by Bryan on 11/Mar/07 11:37 AM. In particular this patch does the following:

          1 - Changes Predicate.isRelationalOpPredicate() so that it just calls
          the already existing method "isRelationalOperator()" on the left
          operand of the predicate's AND node. I.e.:

          - return ((getRelop() != null) && (getSourceInList() == null));
          + return andNode.getLeftOperand().isRelationalOperator();

          I completely forgot that the "isRelationalOperator()" method already
          existed, even though I myself made probe-predicate-based changes to
          that method as part of d47_mp_relOpPredCheck:

            public boolean isRelationalOperator()
            {
          -     return true;
          +     /* If this rel op is for a probe predicate then we do not call
          +      * it a "relational operator"; it's actually a disguised IN-list
          +      * operator.
          +      */
          +     return (inListProbeSource == null);
            }

          As a result of those changes we can now just call that method when
          checking to see if a predicate is a relational op predicate. This
          ultimately comes down to a simple check for a null variable in
          BinaryRelationalOperatorNode, as seen above.

          I believe this change addresses Bryan's comment #7, which pointed
          out that the old code:

          return ((getRelop() != null) && (getSourceInList() == null));

          seemed a tad expensive since it was replacing a simple call to
          "relop != null". The new code (with this patch) is much more
          comparable to the "relop != null" check in terms of "work" that
          it does (we have an additional call to getLeftOperand(), but
          that's about it).

          2 - Inspired by the "isRelationalOperator()" method defined in ValueNode
          and used above, I added a similar method, "isInListProbeNode()",
          to ValueNode, as well. The default case returns "false", while
          BinaryRelationalOperatorNode returns true if it has a source IN-
          list associated with it:

          + /** @see ValueNode#isInListProbeNode */
          + public boolean isInListProbeNode()
          + {
          +     return (inListProbeSource != null);
          + }

          Then I added a corresponding method called "isInListProbePredicate()"
          to Predicate.java. This method allows for simple (and relatively
          cheap) checking of a predicate to see if it is an IN-list probe
          predicate. There are several places in the code where we would
          attempt to retrieve the underlying source IN-list (via a call to
          "getSourceInList()") just to see if it was non-null. All of those
          occurrences have now been replaced by a call to the new method on
          Predicate.java. I think this is a cleaner and cheaper way to
          go about it.

          3 - Modifies Predicate.getSourceInList() to return the underlying
          InListOperatorNode for probe predicates AND for "normal"
          IN-list predicates (i.e. an IN-list that could not be
          transformed into a "probe predicate" because it contains
          one or more non-parameter, non-constant values)

          This then allowed for some cleanup of the code mentioned in
          Bryan's comment #2. Some of the logic for that code was
          specifically targeted for the old rewrite algorithm (use
          of a BETWEEN operator), so I fixed it up and added comments
          as I felt appropriate.

          I also added a second version of getSourceInList() that takes a
          boolean argument; if true, then it will only return the source
          IN list for a predicate if that predicate is an IN-list
          probe predicate.

          4 - Changes PredicateList.generateInListValues() to account for the
          fact that it only ever gets called when we know that there is
          a probe predicate in the list. This addresses Bryan's review
          comment #3.

          5 - Shortens a couple of lines in FromBaseTable that were added with
          earlier patches but were longer than 80 chars. Also rewrites
          one Sanity check in that class to avoid construction of strings
          when no error occurs (per recent discussions on derby-dev).

          I ran derbyall and suites.All with ibm142 on Red Hat Linux with no new failures. Feedback or further review of these changes is appreciated. I'll plan to commit on Monday if I don't hear any objections.

          Many many thanks again to Bryan for his time and suggestions!

          Bryan Pendleton added a comment -

          Thanks for all the attention to detail, Army! The mp_cleanup_v1 patch looks very clean to me.

          A B added a comment -

          Thank you for the review of the cleanup patch, Bryan! I will continue with my plan to commit that patch before the end of day on Monday if no other comments come in.

          I'm also attaching here a (final?) patch for this issue: d47_mp_junitTest_v1.patch, which creates a new JUnit test based on the repro program attached to this issue (thanks James Synge!). The test creates the same kind of table and data that Derby47PerformanceTest.java creates, and then runs three types of queries with larger and larger IN lists. The three types of queries are:

          1 - "Markers" : same as in James' program
          2 - "Literals" : same as in James' program
          3 - 'MixedIds": IN list has a combination of parameter markers and literals.

          For each query we check to make sure the results are correct and then we look at the query plan to determine whether or not the optimizer chose to do multi-probing. If the results are incorrect or if the optimizer did not choose multi-probing then the test will fail.

          The test determines that "multi-probing" was in effect by looking at the query plan and verifying two things:

          1. We did an index scan on the target table AND

          2. The number of rows that "qualified" is equal to the number of rows that were actually returned for the query. If we did not do multi-probing then we would scan all or part of the index and then apply the IN-list restriction after reading the rows. That means that the number of rows "qualified" for the scan would be greater than the number of rows returned from the query. But if we do multi-probing we will just probe for rows that we know satisfy the restriction, thus the number of rows that we "fetch" for the scan (i.e. "rows qualified") should exactly match the number of rows in the result set.
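The two-part check above can be sketched as a small helper. This is a hypothetical illustration, not the actual JUnit test code; the MultiProbeCheck class and usedMultiProbing() method are invented names for the heuristic the test applies to the query plan.

```java
// Hypothetical sketch of the test's heuristic (not the actual JUnit code):
// multi-probing is inferred when an index scan was used and the number of
// rows the scan qualified equals the number of rows the query returned.
public class MultiProbeCheck {
    public static boolean usedMultiProbing(
            boolean indexScanUsed, long rowsQualified, long rowsReturned) {
        return indexScanUsed && rowsQualified == rowsReturned;
    }

    public static void main(String[] args) {
        // Probing: only the matching rows were fetched.
        System.out.println(usedMultiProbing(true, 3, 3));    // true
        // Range scan: many rows qualified, few returned.
        System.out.println(usedMultiProbing(true, 5000, 3)); // false
    }
}
```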

          I ran the new test using ibm142, ibm15, jdk142, jdk15, jdk16, and weme6.1 and it passed in all cases.

          I also ran the new test against a build that I created before any of the DERBY-47 changes were committed; as expected, the test failed because even though the optimizer did choose to use an index, it scanned a lot (thousands) of extra rows for that index.

          Any reviews/comments on this new JUnit test are very much welcomed. In the absence of any feedback to the contrary, I'm thinking I'll commit this new test by the end of Monday, as well. And of course, comments/suggestions can still be made after that, if needed...

          Bryan Pendleton added a comment -

          InListMultiProbeTest applied and built and passed in my environment. Looks good!

          A B added a comment -

          Thank you, Bryan, for taking a look at the new JUnit test and for verifying that it runs.

          I committed the cleanup patch and the new JUnit tests with the following svn commits (respectively):

          URL: http://svn.apache.org/viewvc?view=rev&rev=520188
          URL: http://svn.apache.org/viewvc?view=rev&rev=520191

          I'm marking the issue as "Resolved" in 10.3 since I believe this wraps up the changes for this issue. I plan to run some (simple) before-and-after numbers and post them tomorrow.

          I will wait a few days to check for fallout and then I will close or re-open this issue accordingly.

          For any of the people "watching" this issue, if you are willing and able to sync up with the latest trunk for testing only (do not use the trunk in production, nor on production databases!) and provide feedback on whether or not the changes for this issue address your concerns, that'd be great. If you are still experiencing problems even after these changes, you may need to file one or more additional Jira issues to address those specific problems. As it is, I think the work for this specific Jira issue is pretty much "done"...

          Follow-up comments and feedback are, of course, always and greatly welcomed.

          A B added a comment -

          Attaching a very simple document with some straightforward "before-and-after" measurements based on the Derby47PerformanceTest attached to this issue. I wrote this document pretty quickly so it's not very elegant, but it does show the improvement that I see from the DERBY-47 changes. Here's the conclusion, pasted from the document:

          <begin paste>

          Conclusion:

          From a "multi-probing" perspective these numbers show what we expect: namely, that we can save a lot of time by selectively probing an index for a list of values instead of scanning all (or large parts) of that index for a relatively small number of rows.

          But there are a couple of areas that could use follow-up work. In particular:

          1. As seen in this document, the compilation/optimization time for "Literals" is far larger than it is for "Markers". Since the "probe predicate" optimization in theory applies the same to both strategies, further investigation is needed to figure out what it is about "Literals" that makes for such a long compilation time. I do not currently have any theories as to what could be at the root of this. Note, though, that this relatively excessive compilation time was an issue even before the changes for DERBY-47 were committed.

          2. Not surprisingly, the costing logic for probing as implemented in DERBY-47 is not perfect. It works great in situations where the IN list is significantly smaller than the number of rows in the table; for example, we see good results for 100k rows. However, I discovered that if we just run with 10,000 rows, then once we hit 1,000 values in the IN list the costing of probe predicates causes the optimizer to think that it would be too expensive, so it (the optimizer) ends up doing a table scan. In truth the table scan is still far slower than index probing, but the relative size of the IN list with respect to the number of rows in the table throws the costing off. To confirm this I just removed the probing cost logic (so that it effectively becomes the cost of a single "col = ?" predicate) and then the optimizer chose to do index probing for the 10,000 row scenario, which was much, much faster (as expected).
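
          A toy cost model (purely illustrative; these are not Derby's actual cost formulas) shows how an estimate that charges one index probe per IN-list value can cross over the table-scan estimate once the list gets large relative to the table, even when probing is really faster:

          ```python
          # Toy optimizer cost model, for illustration only.
          # Assumptions: one index probe costs ~log2(rows) page reads,
          # and a table scan reads rows/rows_per_page pages.
          import math

          def probe_cost(in_list_size, rows):
              """Estimated cost of probing the index once per IN-list value."""
              return in_list_size * math.log2(rows)

          def scan_cost(rows, rows_per_page=10):
              """Estimated cost of scanning the whole table."""
              return rows / rows_per_page

          # 100,000 rows, 100-value IN list: probing looks cheaper, so the
          # optimizer would pick index probing.
          assert probe_cost(100, 100_000) < scan_cost(100_000)

          # 10,000 rows, 1,000-value IN list: the probe estimate now exceeds
          # the scan estimate, so the optimizer would pick the table scan,
          # even though real probing may still be faster.
          assert probe_cost(1_000, 10_000) > scan_cost(10_000)
          ```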

          My feeling is that any work related to investigation/resolution of these two issues can and should be completed as part of a new Jira...
          <end paste>

          Show
          A B added a comment - Attaching a very simple document with some straightforward "before-and-after" measurements based on the Derby47PerformanceTest attached to this issue. I wrote this document pretty quickly so it's not very elegant, but it does show the improvement that I see from the DERBY-47 changes. Here's the conclusion, pasted from the document: <begin paste> Conclusion: From a "multi-probing" perspective these numbers show what we expect: namely, that we can save a lot of time by selectively probing an index for a list of values instead of scanning all (or large parts) of that index for a relatively small number of rows. But there are a couple of areas that could use follow-up work. In particular: 1. As seen in this document, the compilation/optimization time for "Literals" is far larger than it is for "Markers". Since the "probe predicate" optimization in theory applies the same to both strategies, further investigation is needed to figure out what it is about "Literals" that makes for such a long compilation time. I do not currently have any theories as to what could be at the root of this. Note, though, that this relatively excessive compilation time was an issue even before the changes for DERBY-47 were committed. 2. Not surprisingly, the costing logic for probing as implemented in DERBY-47 is not perfect. It works great in situations where the IN list is significantly smaller than the number of rows in the table--ex. we see good results for 100k rows. However, I discovered that if we just run with 10,000 rows, then once we hit 1,000 values in the IN list the costing of probe predicates causes the optimizer to think that it would be too expensive, so it (the optimizer) ends up doing a table scan. In truth the table scan is still far slower than index probing, but the relative size of the IN list with respect to the number of rows in the table throws the costing off. 
To confirm this I just removed the probing cost logic (so that it effectively becomes the cost of a single "col = ?" predicate) and then the optimizer chose to do index probing for the 10,000 row scenario, which was much, much faster (as expected). My feeling is that any work related to investigation/resolution of these two issues can and should be completed as part of a new Jira... <end paste>
          Hide
          A B added a comment -

          Updated copy of the before-and-after numbers. This one actually tells what the numbers in the tables mean.

          Bryan Pendleton added a comment -

          Thanks for posting the measurement analysis, Army; it's good to see the numbers backing up
          the results we hoped to see.

          I agree that the two follow-up issues that you uncovered should be logged and pursued separately.

          Laura Stewart added a comment -

          Army - Just FYI
          In the Tuning Guide, the IN operator is mentioned in the optimization and query sections, as shown below:

          DML statements and performance
          Performance and optimization
          Index use and access paths
          Joins and performance
          Derby's cost-based optimization
          Locking and performance
          Non-cost-based optimizations
          Non-cost-based sort avoidance (tuple filtering)
          The MIN() and MAX() optimizations
          Overriding the default optimizer behavior

          Internal language transformations
          Predicate transformations
          Static IN predicate transformations
          Subquery processing and transformations
          DISTINCT elimination in IN, ANY, and EXISTS subqueries
          IN/ANY subquery transformation

          It is not clear to me if the Optimization section needs to be updated with any info about improvements to the IN optimization.
          But I wanted to provide you with this info so that you can decide.

          Andrew McIntyre added a comment -

          Hi Army, I just checked in a converted JUnit test for the old distinct.sql. All the fixtures in the test had been working, until I updated to get the new RuntimeStatisticsParser so I could use it in the test, and then one fixture started failing with an ASSERT related to DERBY-47. I was hoping if you have some spare time that you might be able to take a look at it and see if you can figure out what's going on a little quicker than I. Look for and uncomment the testResultSetInOrderWhenUsingIndex() in the new DistinctTest class. Pardon the confusing name; it's taken directly from the comment that precedes the old test in the .sql file, so I kept it as the method name for the test fixture.

          The really confusing thing to me is why the identical 'prepare as ...' with an identical select statement isn't getting the assert when the .sql test is running under the old harness. I've probably just missed something subtle from the old test, and maybe another pair of eyes will help me spot what that is.

          A B added a comment -

          Hi Andrew,

          Thank you for reporting this! My guess is that the difference between the JUnit test and the old distinct.sql is the specification of certain properties for the old test--namely:

          derby.optimizer.optimizeJoinOrder=false
          derby.optimizer.noTimeout=true
          derby.optimizer.ruleBasedOptimization=true

          When I specified these three properties on the command line for the JUnit test, all fixtures ran as expected. I then narrowed the failure that you're seeing down to the "optimizeJoinOrder" property: if it's not specified--i.e. if optimization of join orders occurs "as normal", the test fails. It only passes if optimizeJoinOrder is false.

          So that's good news for you--you know why the problem is occurring. And I think it means you found a bug somewhere with DERBY-47, too...so thank you!

          I will look into this...

          Andrew McIntyre added a comment -

          Hah! I had already deleted all the properties files for the old tests in my client so it didn't occur to me to look there. A fortunate oversight, then, I suppose.

          Two questions, then:

          1. shall I open a new issue for the test failure? If so, I'll be glad to, or we can roll it in to DERBY-2491.

          2. should I add the other two properties to the new test run? Adding the noTimeout flag doesn't significantly increase the amount of time the test takes to run. I'm not clear on what the ruleBasedOptimization flag does. Does that instruct the optimizer to not attempt the normal cost-based optimization and only do some rule-based optimization? The test runs (and passes) in a normal amount of time with or without this property set and noTimeout = true.

          A B added a comment -

          > 1. shall I open a new issue for the test failure?

          Yes, I think that'd be good. Note though that this is an actual engine bug, not just a test problem. So it'd be good to make that clear in the new issue. The test case can serve as a repro for the engine bug.

          > 2. should I add the other two properties to the new test run?

          First I should mention that the full list of properties for this test includes:

          derby.infolog.append=true
          derby.optimizer.optimizeJoinOrder=false
          derby.optimizer.noTimeout=true
          derby.optimizer.ruleBasedOptimization=true
          derby.language.statementCacheSize=1000

          As for which of these are required, ummm....I don't know. My assumption is that the properties were added for specific reasons, but without corresponding comments it's hard to tell. Generally speaking any test for which we need to verify all or part of a query plan uses "noTimeout=true" to ensure that we get the same plan regardless of platform or machine speed. So it might be worth it to keep that one. But it'll have to be your call for the rest of them...

          > I'm not clear on what the ruleBasedOptimization flag does. Does that instruct the optimizer to not attempt the normal
          > cost-based optimization and only do some rule-based optimization?

          Yes, that's exactly what it does. But that said, I know practically nothing about rule-based optimization, and I don't know why that particular property was added for the old distinct.sql test. Maybe you can do some investigating there to figure it out....and if not...well, like I said, it's your call

          Andrew McIntyre added a comment -

          Army> Note though that this is an actual engine bug, not just a test problem. So it'd be good to make that clear in the new issue.

          Agreed. I have a (bad?) tendency to overload JIRA issues by fixing everything that comes along whilst working on that one issue. Opened DERBY-2500 for this new one.

          As for the other properties you listed, I can't see how any of those would affect how the test runs, with the exception of noTimeout and ruleBasedOptimization, since they might affect the query plans in the runtime statistics. But, since the tests complete in a reasonable amount of time with noTimeout=true I'll leave that in, and since all the tests pass with or without ruleBasedOptimization set to true, I'll leave that off. Not really worth investigating, since it seems to have no effect on the new test.

          A B added a comment -

          The issue filed by Andrew (DERBY-2500) has been resolved with no apparent fallout. I have also filed several Jiras for follow-up enhancements that can be done in the future, namely DERBY-2482, DERBY-2483, and DERBY-2503.

          As I have not heard anything (good nor bad) from the people "watching" this issue, and since other Jiras exist for related work, I think we should be able to close this issue now.

          Sunitha, if you agree can you mark this Jira as closed?

          Sunitha Kambhampati added a comment -

          Thanks Army for all your great work on this. Closing this issue now.


            People

            • Assignee: A B
            • Reporter: Sunitha Kambhampati
            • Votes: 8
            • Watchers: 4
