Derby
  1. Derby
  2. DERBY-2752

PreparedStatement with parameters uses different (more worse) access path than Statement with constants

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Won't Fix
    • Affects Version/s: 10.2.2.0
    • Fix Version/s: None
    • Component/s: JDBC, SQL
    • Labels:
      None
    • Environment:
      Windows XP, JDK 1.4.2
    • Urgency:
      Normal
    • Bug behavior facts:
      Performance

      Description

      On our application we are executing the SQL below via a PreparedStatement. The execution time is minutes.
      When trying to reproduce the problem with the DBVisualizer the performance was good.
      After writing a small test program (see attachement) I found out that the access path with PreparedStatement is different to the one of Statement:
      Hash Scan ResultSet for ESVN01 using index IESVN01A at read committed isolation level using instantaneous share row locking:
      (optimizer estimated cost: 2773,96 but Rows seen = 108014449)
      but better was
      Hash Scan ResultSet for ESVN01 using constraint SQL070530065000411 at read committed isolation level using instantaneous share row locking:
      (optimizer estimated cost: 2791,96 and Rows seen = 10393)

      Our solution was to remove the index IESVN01A and so force the PreparedStatement not to use it (as the Statement already does).

      With the removed index there are still different access paths
      (PreparedStatement uses an additional "Index Row to Base Row ResultSet for ESVN01" before "Hash Scan ResultSet for ESVN01")
      See differenced between StatementwithoutIndexIESVN01A.txt and PreparedStatementwithoutIndexIESVN01A.txt

      As it is not clear for us if in some other cases the PreparedStatementOptimizer fails this issue should be solved anyway.

      — SQL with values: —
      SELECT O4Work.ESVN01.NU_BUY_CPY AS PO_BuyCompanyNo, O4Work.ESVN01.NU_ODR AS PO_Number, O4Work.ESVN01.NA_SEN AS PO_ShortName, O4Work.ESVN01.FL_ODR_CAE AS PO_Type, O4Work.ESVN01.NU_MCS_SPY AS PO_SupplierNo, O4Work.ESVN01.NU_ST3 AS PO_StatusNo, O4Work.ESVN01.DA_SPY_COY_PRT AS PO_SCPrintDate, O4Work.ESVN01.FL_SAS AS PO_SeasFlag, O4Work.ESVNA5.ID_PTE AS POPA_PictureID, O4Work.ESVNG3.NU_ODR AS ON_ID, O4Work.ESVN02.NU_PST AS POP_Position_Id, O4Work.ESVN02.NU_CTT AS POP_ContractNo, O4Work.ESVN02.NU_ARO_CTT AS POP_ArosContractNo, O4Work.ESVN02.NU_ST3 AS POP_StatusNo, O4Work.ESVN02.DA_CAE AS POP_CreationDate, O4Work.ESVN02.NA_CAE_USE AS POP_CreationUID, O4Work.ESVN02.DA_LAT_AMD AS POP_LastAmendDate, O4Work.ESVN02.NA_LAT_AMD_USE AS POP_LastAmendUID, O4Work.ESVN02.NA_SPY_STL AS POP_SuppStyle, O4Work.ESVNA0.NU_SSN_IDE AS POPD_SeasonInd, O4Work.ESVNA0.NA_PDE AS POPD_ProductName, O4Work.ESVNA0.NA_PDE_VSN AS POPD_ProdVerName, O4Work.ESVNA0.NU_STL_ID1 AS POPD_StyleId, O4Work.ESVNA0.NU_SRY_ID1 AS POPD_StoryID,
      O4Work.ESVNA0.NU_LC1 AS POPD_LicenseID, O4Work.ESVNA0.NA_DSN_ID1 AS POPD_DesignID, O4Work.ESVNA4.NA_HAG AS POPDH_HangerName, O4Work.ESVP00.NU_CSY AS SER_ClassNo, O4Work.ESVP00.NU_COE AS SER_CodeNo, O4Work.ESVP00.NU_SRL AS SER_SerialNo, O4Work.ESVP00.NU_PIK_MOD AS SER_PickingM, O4Work.ESVN03.NU_MT1_CPY AS POPC_MasterCpyNo, O4Work.ESVN03.QU_ODR AS POPC_OrderedQty, O4Work.ESVN03.DA_EDD AS POPC_Edd, O4Work.ESVN03.DA_LDD AS POPC_Ldd, O4Work.ESVN03.DA_PAD AS POPC_Pad, O4Work.ESVN03.DA_SAD AS POPC_Sad, O4Work.ESVN03.PR_SCP AS POPC_SupCstPrice, O4Work.ESVN03.NU_SCP_CR1 AS POPC_SupCstPrCurr, O4Work.ESVN03.NU_ST3 AS POPC_StatusNo, O4Work.ESVN03.NU_COY_FRM_ODR AS POPC_Src_PO_Number, O4Work.ESVN03.NU_COY_UTL_ODR AS POPC_Tgt_PO_Number, O4Work.ESVN03.DA_FLR_RDY AS POPC_FRM_DATE, O4Work.ESVN03.FL_CSG AS POPC_CS_FLAG, O4Work.ESVN03.NU_PAK_MOD_SPY AS POPC_PackSupplNo, O4Work.ESVN03.NU_PAK_MOD_DCR AS POPC_PackingDCNo, O4Work.ESVN03.NU_PS2_MOD AS POPC_PresMethodNo, O4Work.ESVN04.NU_RTL_CPY AS POPRC_RetailCode,
      O4Work.ESVN04.PR_PLN_SEL AS POPRC_SellPrice, O4Work.ESVN08.NU_AVE AS POPRCA_AdvertNo, O4Work.ESVQ00.ID_SHP AS SHP_ShippingID, O4Work.ESVQ00.NU_SHP AS SHP_ShippingNo, O4Work.ESVNB0.NU_NTL_PDE_ID1 AS POPDC_NationalID, O4Work.ESVNB0.NU_EQP AS POPDC_EquipNumber, O4Work.ESVNE1.PE_OMU AS POPCC_OMU, O4Work.ESVN07.FL_ALE_RMK AS POPCU_AllocRem, O4Work.ESVN0A.DA_TE1_RQT AS POPCT_TicketReq FROM O4Work.ESVN02 INNER JOIN O4Work.ESVN01 ON O4Work.ESVN02.NU_BUY_CPY = O4Work.ESVN01.NU_BUY_CPY AND O4Work.ESVN02.NU_ODR = O4Work.ESVN01.NU_ODR LEFT OUTER JOIN O4Work.ESVNG3 ON O4Work.ESVN01.NU_BUY_CPY = O4Work.ESVNG3.NU_BUY_CPY AND O4Work.ESVN01.NU_ODR = O4Work.ESVNG3.NU_ODR INNER JOIN O4Work.ESVNA0 ON O4Work.ESVN02.NU_BUY_CPY = O4Work.ESVNA0.NU_BUY_CPY AND O4Work.ESVN02.NU_ODR = O4Work.ESVNA0.NU_ODR AND O4Work.ESVN02.NU_PST = O4Work.ESVNA0.NU_PST INNER JOIN O4Work.ESVP00 ON O4Work.ESVNA0.ID_SRL = O4Work.ESVP00.ID_SRL LEFT OUTER JOIN O4Work.ESVNA4 ON O4Work.ESVNA0.NU_BUY_CPY = O4Work.ESVNA4.NU_BUY_CPY AND
      O4Work.ESVNA0.NU_ODR = O4Work.ESVNA4.NU_ODR AND O4Work.ESVNA0.NU_PST = O4Work.ESVNA4.NU_PST LEFT OUTER JOIN O4Work.ESVNA5 ON O4Work.ESVNA0.NU_BUY_CPY = O4Work.ESVNA5.NU_BUY_CPY AND O4Work.ESVNA0.NU_ODR = O4Work.ESVNA5.NU_ODR AND O4Work.ESVNA0.NU_PST = O4Work.ESVNA5.NU_PST INNER JOIN O4Work.ESVN03 ON O4Work.ESVN02.NU_BUY_CPY = O4Work.ESVN03.NU_BUY_CPY AND O4Work.ESVN02.NU_ODR = O4Work.ESVN03.NU_ODR AND O4Work.ESVN02.NU_PST = O4Work.ESVN03.NU_PST LEFT OUTER JOIN O4Work.ESVN04 ON O4Work.ESVN03.NU_BUY_CPY = O4Work.ESVN04.NU_BUY_CPY AND O4Work.ESVN03.NU_MT1_CPY = O4Work.ESVN04.NU_MT1_CPY AND O4Work.ESVN03.NU_ODR = O4Work.ESVN04.NU_ODR AND O4Work.ESVN03.NU_PST = O4Work.ESVN04.NU_PST LEFT OUTER JOIN O4Work.ESVN08 ON O4Work.ESVN04.NU_BUY_CPY = O4Work.ESVN08.NU_BUY_CPY AND O4Work.ESVN04.NU_RTL_CPY = O4Work.ESVN08.NU_RTL_CPY AND O4Work.ESVN04.NU_MT1_CPY = O4Work.ESVN08.NU_MT1_CPY AND O4Work.ESVN04.NU_ODR = O4Work.ESVN08.NU_ODR AND O4Work.ESVN04.NU_PST = O4Work.ESVN08.NU_PST INNER JOIN O4Work.ESVQ00 ON
      O4Work.ESVN03.ID_SHP = O4Work.ESVQ00.ID_SHP INNER JOIN O4Work.ESVNB0 ON O4Work.ESVN03.NU_BUY_CPY = O4Work.ESVNB0.NU_BUY_CPY AND O4Work.ESVN03.NU_MT1_CPY = O4Work.ESVNB0.NU_MT1_CPY AND O4Work.ESVN03.NU_ODR = O4Work.ESVNB0.NU_ODR AND O4Work.ESVN03.NU_PST = O4Work.ESVNB0.NU_PST INNER JOIN O4Work.ESVNE1 ON O4Work.ESVN03.NU_BUY_CPY = O4Work.ESVNE1.NU_BUY_CPY AND O4Work.ESVN03.NU_MT1_CPY = O4Work.ESVNE1.NU_MT1_CPY AND O4Work.ESVN03.NU_ODR = O4Work.ESVNE1.NU_ODR AND O4Work.ESVN03.NU_PST = O4Work.ESVNE1.NU_PST LEFT OUTER JOIN O4Work.ESVN0A ON O4Work.ESVN03.NU_BUY_CPY = O4Work.ESVN0A.NU_BUY_CPY AND O4Work.ESVN03.NU_MT1_CPY = O4Work.ESVN0A.NU_MT1_CPY AND O4Work.ESVN03.NU_ODR = O4Work.ESVN0A.NU_ODR AND O4Work.ESVN03.NU_PST = O4Work.ESVN0A.NU_PST LEFT OUTER JOIN O4Work.ESVN07 ON O4Work.ESVN03.NU_BUY_CPY = O4Work.ESVN07.NU_BUY_CPY AND O4Work.ESVN03.NU_MT1_CPY = O4Work.ESVN07.NU_MT1_CPY AND O4Work.ESVN03.NU_ODR = O4Work.ESVN07.NU_ODR AND O4Work.ESVN03.NU_PST = O4Work.ESVN07.NU_PST
      WHERE O4Work.ESVN01.NU_BUY_CPY = 99 AND O4Work.ESVNA0.NU_CPY_GRP = 0 AND O4Work.ESVNE1.NU_COT_TYP = 1 AND O4Work.ESVP00.NU_CSY = 603 AND O4Work.ESVNA0.NU_SSN_IDE = 10

      1. Statement.txt
        32 kB
        Stefan Cordes
      2. PreparedStatementwithoutIndexIESVN01A.txt
        32 kB
        Stefan Cordes
      3. PreparedStatement.txt
        32 kB
        Stefan Cordes
      4. derby-sql-26965.zip
        8.48 MB
        Stefan Cordes
      5. DerbySQL26965.java
        14 kB
        Stefan Cordes

        Activity

        Hide
        Rick Hillegas added a comment -

        Re-open to change closed status.

        Show
        Rick Hillegas added a comment - Re-open to change closed status.
        Hide
        Rick Hillegas added a comment -

        I am closing this issue for the following reasons:

        1) It has low priority.

        2) There is a workaround.

        3) There has been no activity on this issue for more than a year.

        4) The general behavior is expected: the optimizer can pick a better plan if you give it constants. The optimizer can give you a worse plan if all you give it are ? variables.

        Show
        Rick Hillegas added a comment - I am closing this issue for the following reasons: 1) It has low priority. 2) There is a workaround. 3) There has been no activity on this issue for more than a year. 4) The general behavior is expected: the optimizer can pick a better plan if you give it constants. The optimizer can give you a worse plan if all you give it are ? variables.
        Hide
        claes larsson added a comment -

        Hi,

        The same issue is raised (i.e. the prepared statement is using the wrong index and performance suffers)
        here

        http://www.nabble.com/Statement-PreparedStatement-performance-difference-and-problem-tt16537511.html#a16763534

        It can be be demonstrated in a small selfcontained program.

        Thanks for listening...

        Show
        claes larsson added a comment - Hi, The same issue is raised (i.e. the prepared statement is using the wrong index and performance suffers) here http://www.nabble.com/Statement-PreparedStatement-performance-difference-and-problem-tt16537511.html#a16763534 It can be be demonstrated in a small selfcontained program. Thanks for listening...
        Hide
        Stefan Cordes added a comment -

        Sure DB2 z/OS can handle SQLs containing "--". My fault.

        I meant the "access path optimizations" mentioned in this issue.

        Show
        Stefan Cordes added a comment - Sure DB2 z/OS can handle SQLs containing "--". My fault. I meant the "access path optimizations" mentioned in this issue.
        Hide
        A B added a comment -

        Thank you for the reply, Stefan.

        > The optimization hints for Derby a quite nice but as our application is intended to run on z/OS
        > DB2 (Online) and Derby (Offline) without major changes, the "--DERBY-PROPERTIES" would
        > be the last option to choose for us.

        If z/OS supports SQL "simple comments" that begin with "-" (which is SQL 2003 standard, section 5.2), then I think that Derby's optimizer hints should be treated as normal SQL comments by z/OS: i.e. they should be ignored. So in theory a query which includes a proper "-DERBY-PROPERTIES" hint should run without error in any SQL engine that supports SQL "simple comments". Have you tried executing an SQL statement that has Derby optimizer hints against z/OS? If it fails, I'd be curious to know what error you are seeing?

        > As we have several possibilities now to influence the access path

        Are you referring to the possibilities that have already been mentioned in this issue, or to something else? If you know of other ways to influence the access path, is that something you are willing/able to share, in case other users of Derby have similar problems now or in the future?

        Thanks again for your feedback.

        Show
        A B added a comment - Thank you for the reply, Stefan. > The optimization hints for Derby a quite nice but as our application is intended to run on z/OS > DB2 (Online) and Derby (Offline) without major changes, the "--DERBY-PROPERTIES" would > be the last option to choose for us. If z/OS supports SQL "simple comments" that begin with "- " (which is SQL 2003 standard, section 5.2), then I think that Derby's optimizer hints should be treated as normal SQL comments by z/OS: i.e. they should be ignored. So in theory a query which includes a proper " -DERBY-PROPERTIES" hint should run without error in any SQL engine that supports SQL "simple comments". Have you tried executing an SQL statement that has Derby optimizer hints against z/OS? If it fails, I'd be curious to know what error you are seeing? > As we have several possibilities now to influence the access path Are you referring to the possibilities that have already been mentioned in this issue, or to something else? If you know of other ways to influence the access path, is that something you are willing/able to share, in case other users of Derby have similar problems now or in the future? Thanks again for your feedback.
        Hide
        Stefan Cordes added a comment -

        Hi A B!
        I agree that these are different queries from optimizers point of view.

        The optimization hints for Derby a quite nice but as our application is intended to run on z/OS DB2 (Online) and Derby (Offline)
        without major changes, the "--DERBY-PROPERTIES" would be the last option to choose for us.

        As we have several possibilities now to influence the access path I changed the prio of this issue to "Minor"

        Show
        Stefan Cordes added a comment - Hi A B! I agree that these are different queries from optimizers point of view. The optimization hints for Derby a quite nice but as our application is intended to run on z/OS DB2 (Online) and Derby (Offline) without major changes, the "--DERBY-PROPERTIES" would be the last option to choose for us. As we have several possibilities now to influence the access path I changed the prio of this issue to "Minor"
        Hide
        A B added a comment -

        Thank you for providing a repro, Stefan.

        I ran the test program against the DB4O4 database and confirmed that the fourth query does indeed take much longer than the first three.

        > I found out that the access path with PreparedStatement is different to the one of Statement

        Just to be clear here, the queries that are executed by Statement and PreparedStatement are not identical. The former (Statement) uses constants in the query predicates while the latter (PreparedStatement) uses parameters. The reason I bring this up is because when I first saw this issue, I thought the exact same statement was running more slowly with a PreparedStatement than with a Statement. But that is not the case.

        I think the real issue here is "constants vs parameters", not "Statement vs PreparedStatement". That is to say, if the query includes constants then the optimizer chooses a much better plan than if the query includes parameters. Is that an accurate assessment of the problem you are seeing?

        If so, I think this behavior actually makes sense to some degree. The reason is that when using constants, the optimizer can look at row count statistics to figure out how many rows in the table are going to match the predicates. Then it can pick a query plan that more closely aligns with the expected row count. If the constants are replaced with parameters, though, the optimizer does not know how many rows are actually going to match the predicate, and so must take a guess as to what that number will be. This guess introduces some uncertainty into the optimizer's cost estimates. And when combined with other known issues in Derby-esp. DERBY-1905 and DERBY-1907-this uncertainty in the cost estimates could then cause the optimizer to choose a bad plan.

        Note that I agree with Stefan: the query in question should not be taking over an hour to execute, so something is definitely off here. I just wanted to clarify that the issue appears to be the use of parameters (rather than the use of PreparedStatements per se).

        > Our solution was to remove the index IESVN01A and so force the PreparedStatement not to use
        > it (as the Statement already does).

        Another option for working around this issue would be to use optimizer overrides to indicate that you want a specific constraint to be used (instead of the index). See:

        http://db.apache.org/derby/docs/dev/tuning/ctunoptimzoverride.html

        You can also specify a join strategy if you find one that works best. I understand that that is not ideal, but it may provide you with more control of your queries in the short term...

        Show
        A B added a comment - Thank you for providing a repro, Stefan. I ran the test program against the DB4O4 database and confirmed that the fourth query does indeed take much longer than the first three. > I found out that the access path with PreparedStatement is different to the one of Statement Just to be clear here, the queries that are executed by Statement and PreparedStatement are not identical. The former (Statement) uses constants in the query predicates while the latter (PreparedStatement) uses parameters. The reason I bring this up is because when I first saw this issue, I thought the exact same statement was running more slowly with a PreparedStatement than with a Statement. But that is not the case. I think the real issue here is "constants vs parameters", not "Statement vs PreparedStatement". That is to say, if the query includes constants then the optimizer chooses a much better plan than if the query includes parameters. Is that an accurate assessment of the problem you are seeing? If so, I think this behavior actually makes sense to some degree. The reason is that when using constants, the optimizer can look at row count statistics to figure out how many rows in the table are going to match the predicates. Then it can pick a query plan that more closely aligns with the expected row count. If the constants are replaced with parameters, though, the optimizer does not know how many rows are actually going to match the predicate, and so must take a guess as to what that number will be. This guess introduces some uncertainty into the optimizer's cost estimates. And when combined with other known issues in Derby- esp. DERBY-1905 and DERBY-1907 -this uncertainty in the cost estimates could then cause the optimizer to choose a bad plan. Note that I agree with Stefan: the query in question should not be taking over an hour to execute, so something is definitely off here. I just wanted to clarify that the issue appears to be the use of parameters (rather than the use of PreparedStatements per se). > Our solution was to remove the index IESVN01A and so force the PreparedStatement not to use > it (as the Statement already does). Another option for working around this issue would be to use optimizer overrides to indicate that you want a specific constraint to be used (instead of the index). See: http://db.apache.org/derby/docs/dev/tuning/ctunoptimzoverride.html You can also specify a join strategy if you find one that works best. I understand that that is not ideal, but it may provide you with more control of your queries in the short term...
        Hide
        Stefan Cordes added a comment - - edited

        PreparedStatement.txt: The access path of needing execution time over 1 hour.
        PreparedStatementwithoutIndexIESVN01A.txt: The better access path leaving out an index.
        Statement.txt: The correct access path using Statement (same with and without index)
        DerbySQL26965.java:The program testing with and without index and PreparedStatement and Statement.

        Execution Result:
        Test withoutIndexIESVN01A
        1000 Rows in 6172
        1000 Rows in 5078
        Test
        1000 Rows in 6641
        1000 Rows in 4152422

        Show
        Stefan Cordes added a comment - - edited PreparedStatement.txt: The access path of needing execution time over 1 hour. PreparedStatementwithoutIndexIESVN01A.txt: The better access path leaving out an index. Statement.txt: The correct access path using Statement (same with and without index) DerbySQL26965.java:The program testing with and without index and PreparedStatement and Statement. Execution Result: Test withoutIndexIESVN01A 1000 Rows in 6172 1000 Rows in 5078 Test 1000 Rows in 6641 1000 Rows in 4152422

          People

          • Assignee:
            Unassigned
            Reporter:
            Stefan Cordes
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development