Derby / DERBY-3313

JDBC client driver statement cache

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 10.4.1.3
    • Fix Version/s: 10.4.2.0
    • Component/s: JDBC, Network Client
    • Labels:
      None

      Description

      A statement cache in the JDBC client driver will help increase performance in certain scenarios, for instance some multi-tier systems using connection pooling.
      Please consult the comments and documents attached to this issue for more information.

      1. package.html
        6 kB
        Kristian Waagan
      2. package.html
        6 kB
        Kristian Waagan
      3. JDBCClientStatementCacheOverview.txt
        2 kB
        Kristian Waagan
      4. JDBCClientStatementCacheOverview.txt
        3 kB
        Kristian Waagan
      5. derby-3313-1a-early_prototype.stat
        1 kB
        Kristian Waagan
      6. derby-3313-1a-early_prototype.diff
        66 kB
        Kristian Waagan
      7. bank_transaction_stmtcache_16c.png
        4 kB
        Kristian Waagan

        Issue Links

          Activity

          Kristian Waagan added a comment -

          'JDBCClientStatementCacheOverview.txt' (revision 1.0) gives a high level overview for a client side statement cache in Derby.
          I do plan to get this done for 10.4.

          Comments on the overview document are appreciated.

          Daniel John Debrunner added a comment -

          Just to be clear, the overview says the embedded driver already has a statement cache, but I think it's a different type of cache. The embedded driver caches statement plans (not JDBC statement objects) that can be shared across multiple connections, whereas I think you are planning to cache JDBC statement objects for a specific connection. If you are planning the latter, then you may want to consider that this would also be useful in an embedded environment, so maybe the code could be made to work for both? Not a requirement, but something to think about.

          Kristian Waagan added a comment -

          Thanks for giving feedback on the overview Dan.

          I was not aware of what you state about the cache in the embedded driver, and this was useful information. I will clarify the wording in the overview, and also think about the possibility of sharing code between both drivers.

          Just out of curiosity, how much work do you think is saved/avoided by the current scheme when it comes to re-preparing a statement in the embedded driver?
          Are we talking about less than 10%, something like 50%, or more than that?
          I'm just looking for an indication of how much can be gained from caching JDBC statement objects in the embedded driver.

          Right now I'm not convinced the code can be shared directly and easily. From what I have seen previously, entering the realm of code sharing between the two drivers can result in quite a lot more work. I will keep it in the back of my head though, and try to make the right choices happen if sharing seems feasible.

          Kristian Waagan added a comment -

          I have made a prototype of a client side statement cache. A description
          of the changes made follows below, and I would like feedback on the
          suggested approach. Note that this is a very early version, and there
          are probably several things that haven't received the attention they
          should. I also need to "forward" all calls in the logical prepared
          statement before it can be used; currently it only supports executeQuery
          and close. There are also some JDBC 4.0-specific classes missing.
          I also know that there are some synchronization issues I have to work on,
          that JavaDoc is missing in some places, and that
          DatabaseMetaData.supportsStatementPooling must return true for the
          client.

          All changes are currently restricted to the client packages, and the
          prefix 'org.apache.derby.client' is omitted for conciseness.

          • org.apache.derby.client
            ClientPooledConnection: Added a statement cache and logic for
            instantiating it if 'maxStatements' is bigger than zero. Also added
            logic for creating either a LogicalConnection (if no caching) or a
            CachingLogicalConnection. If caching is disabled, things will be
            exactly as before after a call to getConnection.
          • am
            ClientJDBCObjectFactory: Extended the interface with methods for
            creating a CachingLogicalConnection and a LogicalPreparedStatement
            (both new classes).

            CachingLogicalConnection: A new logical connection class that caches
            prepared statement objects and creates logical prepared statements.

            LogicalPreparedStatement/-40: Logical prepared statement with special
            close logic. The physical prepared statement is added to the statement
            cache on close if there is no other matching query already in there.

          • am.stmtcache
            I'm wondering if we could put this in the shared package, or are we not
            there yet on code sharing between the two drivers?
            The cache is intended to hold only a single "instance" of each
            statement, and all statements in the cache are free. If a statement is
            in use, it will not be in the cache. Whether a statement goes into the
            cache or not is determined when LogicalPreparedStatement.close is called.
            Added three things:

            JDBCStatementCache: The cache itself. Holds PreparedStatement objects.

            Statement keys: An interface for the keys (StatementKey) and 5
            different keys. Only the interface is public; the implementations
            are hidden inside the package. I'm sure there are other design
            possibilities here. Must write equality tests and verify hash codes.

            StatementKeyFactory: A factory generating the appropriate keys based
            on the input information. Note that it recognizes default settings
            for some properties and generates a simplified key in some cases.

          • jdbc
            Have modified two data source classes here. I'm not 100% sure about the
            existing data source hierarchy, but making my things fit in there didn't
            require many changes. Changed files:

            ClientBaseDataSource: Added variable 'maxStatements' and a getter
            method. This property doesn't make sense for a basic data source,
            but this class is what is being passed around when data sources are
            created. Another option is to cast the object, but that requires
            instanceof checks.

            ClientConnectionPoolDataSource: Added a setter for maximum number of
            statements.

          • net
            Added implementations of the factory methods for creating caching
            connections and logical prepared statements, according to changes in the
            interface (see interface am.ClientJDBCObjectFactory).
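
          The statement-key idea described under am.stmtcache can be sketched as
          follows. This is a simplified illustration, not the prototype's code: it
          uses a single hypothetical class (KeySketch) instead of the StatementKey
          interface with five implementations, and it keys only on the SQL text,
          result set type, and concurrency. It shows the two points the description
          raises: keys need consistent equals/hashCode, and a factory method can
          fold the JDBC defaults into the same key as an explicit request for them.

```java
import java.sql.ResultSet;
import java.util.Objects;

/** Hypothetical, simplified cache key; not Derby's StatementKey. */
final class KeySketch {
    private final String sql;
    private final int resultSetType;
    private final int concurrency;

    private KeySketch(String sql, int resultSetType, int concurrency) {
        this.sql = Objects.requireNonNull(sql);
        this.resultSetType = resultSetType;
        this.concurrency = concurrency;
    }

    /** Key for prepareStatement(sql): fills in the JDBC default settings. */
    static KeySketch forQuery(String sql) {
        return forQuery(sql, ResultSet.TYPE_FORWARD_ONLY,
                ResultSet.CONCUR_READ_ONLY);
    }

    /** Key for prepareStatement(sql, resultSetType, concurrency). */
    static KeySketch forQuery(String sql, int resultSetType, int concurrency) {
        return new KeySketch(sql, resultSetType, concurrency);
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof KeySketch)) {
            return false;
        }
        KeySketch k = (KeySketch) other;
        return sql.equals(k.sql)
                && resultSetType == k.resultSetType
                && concurrency == k.concurrency;
    }

    @Override
    public int hashCode() {
        // Must agree with equals: equal keys produce equal hash codes.
        return Objects.hash(sql, resultSetType, concurrency);
    }
}
```

          With keys like this, a statement prepared with only the SQL text and one
          prepared with the default type and concurrency spelled out map to the same
          cache entry. A real key would presumably also have to cover the other
          prepareStatement variants (holdability, auto-generated keys, column
          indexes and names).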

          The caching works across multiple logical connections. Prepared
          statements are not shared across physical connections.
          Going forward I will incorporate suggestions and feedback from the
          community, write tests and continue finishing the implementation.
          I also plan to run the implementation in an application server
          environment for some early-on testing. I'm also playing with the idea of
          running suites.All with statement pooling, but I'm not sure how well
          that will work...

          When things are ready for proper review and commit, I will split the
          changes into smaller parts and submit several more or less independent
          patches.

          Diff stats for prototype patch 'derby-3313-1a-early_prototype.diff':
          client/ClientPooledConnection.java            |  25 +
          client/am/CachingLogicalConnection.java       | 160 ++++++++++
          client/am/ClientJDBCObjectFactory.java        |  24 +
          client/am/LogicalPreparedStatement.java       | 393 ++++++++++++++++++++++++++
          client/am/LogicalPreparedStatement40.java     | 136 ++++++++
          client/am/stmtcache/AutoGeneratedKeysKey.java |  51 +++
          client/am/stmtcache/BasicKey.java             |  65 ++++
          client/am/stmtcache/ColumnIndexesKey.java     |  69 ++++
          client/am/stmtcache/ColumnNamesKey.java       |  68 ++++
          client/am/stmtcache/JDBCStatementCache.java   | 137 +++++++++
          client/am/stmtcache/QueryKey.java             |  58 +++
          client/am/stmtcache/StatementKey.java         |  56 +++
          client/am/stmtcache/StatementKeyFactory.java  | 115 +++++++
          client/net/ClientJDBCObjectFactoryImpl.java   |  32 ++
          client/net/ClientJDBCObjectFactoryImpl40.java |  32 ++
          jdbc/ClientBaseDataSource.java                |  22 +
          jdbc/ClientConnectionPoolDataSource.java      |   9
          17 files changed, 1451 insertions, 1 deletion

          All kinds of feedback are welcome

          Bryan Pendleton added a comment -

          Can you expand on the difference between a Prepared Statement and
          a Logical Prepared Statement? And also on the difference between a
          Connection and a Logical Connection? Are these already-existing
          concepts in the client? Or did you introduce these concepts? Thanks!

          Kristian Waagan added a comment -

          Thank you for your interest in this issue Bryan!

          A logical entity is a wrapper around a "physical" entity. It is a mechanism to avoid closing the physical entity when a user asks to close the object he/she has a reference to. You also need to make sure that the user can't obtain a reference to the physical entity after the logical entity has been closed. In many scenarios such a leaked reference can be fatal, for instance when different worker threads in an application server get in each other's way.

          The logical entity concept already exists in the driver, for instance for connections (used when connections are pooled). I have however introduced it for prepared statements. Oversimplified, the only thing it needs to do is forward all calls to the physical prepared statement and possibly execute some special logic on close. In this case, it is putting the physical prepared statement into the cache if appropriate. The logical entity will also release/nullify any references to its physical entity.

          One physical entity will typically do work for several logical entities during its lifespan, but at different times (non-overlapping). Further, a logical entity is typically a lot cheaper to instantiate than a physical entity.
          An illustration of this is the statement cache in the client driver. Instead of going over the network to the server and re-preparing a statement there, you can simply wrap the existing (physical) prepared statement in the client driver. You save time spent in the network and time/work on the server. The actual benefit from such a cache is of course highly dependent on the application's usage of prepared statements.

          Hope this made sense, if not, ask again!
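
          The logical/physical split described above can be sketched as follows.
          This is a minimal illustration under stated assumptions, not the Derby
          classes: SimpleStatement, PhysicalStatement and LogicalStatement are
          hypothetical stand-ins for the much larger java.sql.PreparedStatement
          interface, the cache is a plain Map keyed on the SQL text, and closing the
          physical statement when an equivalent one is already cached is an assumed
          policy.

```java
import java.util.Map;

/** Hypothetical stand-in for the (much larger) PreparedStatement interface. */
interface SimpleStatement {
    String execute();
    void close();
}

/** The expensive object; in the real driver preparing it costs a round trip. */
final class PhysicalStatement implements SimpleStatement {
    private final String sql;
    private boolean closed;

    PhysicalStatement(String sql) {
        this.sql = sql;
    }

    public String execute() {
        if (closed) {
            throw new IllegalStateException("physical statement is closed");
        }
        return "executed: " + sql;
    }

    public void close() {
        closed = true;
    }

    boolean isClosed() {
        return closed;
    }

    String sql() {
        return sql;
    }
}

/** Cheap wrapper: forwards calls and applies special logic on close. */
final class LogicalStatement implements SimpleStatement {
    private PhysicalStatement physical;
    private final Map<String, PhysicalStatement> cache;

    LogicalStatement(PhysicalStatement physical,
                     Map<String, PhysicalStatement> cache) {
        this.physical = physical;
        this.cache = cache;
    }

    public String execute() {
        if (physical == null) {
            throw new IllegalStateException("logical statement is closed");
        }
        return physical.execute();
    }

    public void close() {
        if (physical == null) {
            return; // Closing twice is a no-op.
        }
        PhysicalStatement p = physical;
        // Drop the reference so the caller can no longer reach the
        // physical statement through this wrapper.
        physical = null;
        if (cache.containsKey(p.sql())) {
            p.close(); // Assumed policy: an equivalent statement is cached.
        } else {
            cache.put(p.sql(), p); // Keep it for the next logical statement.
        }
    }
}
```

          After close, the wrapper rejects further calls while the physical
          statement stays open in the cache, ready to be wrapped by the next logical
          statement that asks for the same SQL.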

          Bryan Pendleton added a comment -

          Thanks Kristian! Are the following statements true?

          • for each physical PreparedStatement, there will be exactly 1 logical PreparedStatement
          • for each logical PreparedStatement, there will be exactly 1 physical PreparedStatement

          Or, is there ever a case where two different logical PreparedStatements point to the
          same physical PreparedStatement?

          Kristian Waagan added a comment -

          Bryan wrote:
          > Thanks Kristian! Are the following statements true?
          > - for each physical PreparedStatement, there will be exactly 1 logical PreparedStatement
          I think this should be rephrased as "for each physical PreparedStatement, there will be exactly 1 active logical PreparedStatement".
          A logical prepared statement is activated when it is prepared (Connection.prepareStatement) and is no longer active when close has been called.
          There might be error situations causing the statement to be closed as well.

          Over time, one physical PreparedStatement can serve a number of logical PreparedStatements.

          > - for each logical PreparedStatement, there will be exactly 1 physical PreparedStatement
          Yes, this is true.
          If a logical prepared statement has a reference to a physical prepared statement, it is the only one with such a reference.
          Note that after a close, a logical prepared statement does not refer to any physical prepared statement and is eligible for garbage collection when the client no longer references it.

          > Or, is there ever a case where two different logical PreparedStatements point to the
          > same physical PreparedStatement?
          No.

          Kristian Waagan added a comment -

          At the moment I'm looking into handling of a statement's resources when closing a logical prepared statement.
          I think this issue will require some changes to existing code (one or more of new methods, refactoring or changes to existing methods).
          Note that these issues are not handled at all in the current prototype. For instance, the statement's result sets are left open in the prototype.

          Bryan Pendleton added a comment -

          Hi Kristian, thanks for the clarifications. I think I'm understanding the idea better now.

          Did I understand correctly that pooled connections may have prepared statement
          caches, but non-pooled connections will not?

          Does each pooled connection have its own cache? If I do something like:
          Connection c = getPooledConnection();
          c.prepareStatement("select * from employee");
          then will the behavior depend on whether or not this pooled connection has
          previously prepared this statement?

          What replacement policy does the statement cache use when it becomes full?

          Thanks again for being patient with my questions.

          Kristian Waagan added a comment -

          Bryan wrote:
          > Did I understand correctly that pooled connections may have prepared statement
          > caches, but non-pooled connections will not?
          Yes, this is correct. Whether a pooled connection will have a cache or not is controlled by the method setMaxStatements of the data source. A value of zero disables statement pooling. The size of the cache/pool is determined at connection creation time only.
          I need to clarify the situation for XA connections.

          > Does each pooled connection have its own cache? If I do something like:
          > Connection c = getPooledConnection();
          > c.prepareStatement("select * from employee");
          > then will the behavior depend on whether or not this pooled connection has
          > previously prepared this statement?
          Yes, each pooled connection will have its own cache.
          There is no caching across pooled connections. A pooled connection is, or is a wrapper for, a physical connection.

          Nitpick:
          The illustrative code can be a tad misleading. What happens in a connection pool is more like this:

          ConnectionPoolDataSource cpDs = new ClientConnectionPoolDataSource();
          PooledConnection pooledCon = cpDs.getPooledConnection();
          Connection clientCon = pooledCon.getConnection();
          // Client uses this connection and eventually closes it.
          Connection nextClientCon = pooledCon.getConnection();
          // Next client does its things.
          // and so on...

          pooledCon.getConnection() returns a logical connection. In my current approach, this will return an instance of either LogicalConnection (no statement cache) or CachingLogicalConnection (with statement caching).

          > What replacement policy does the statement cache use when it becomes full?
          I have planned to throw out the oldest statement in the cache. The approach is to take a statement out of the cache when it is used. If the statement is put into the cache again, it will be inserted as the newest element. The oldest element will thus also be the least recently used element.

          Implementation-wise, I plan to use a LinkedHashMap. It supports both insertion-based and access-based ordering, and it has a mechanism that is very easy to use for throwing out elements (override the method removeEldestEntry).
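
          The replacement policy outlined above can be sketched with a
          LinkedHashMap in insertion order whose removeEldestEntry hook (the actual
          name of the method in java.util.LinkedHashMap) discards the oldest entry
          once a size limit is exceeded. The class and method names below
          (StatementCacheSketch, getCached, cache) are hypothetical, and the sketch
          is not the Derby implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of a bounded statement cache; not Derby's class. */
class StatementCacheSketch<K, V> {
    private final LinkedHashMap<K, V> map;

    StatementCacheSketch(final int maxSize) {
        // Insertion order (accessOrder = false) is enough here: an entry is
        // removed when it is used and re-inserted as the newest when freed.
        this.map = new LinkedHashMap<K, V>(16, 0.75f, false) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                // Called by put(); returning true drops the oldest entry.
                // A real cache would presumably also close the evicted
                // physical statement here.
                return size() > maxSize;
            }
        };
    }

    /** Takes the statement out of the cache; returns null if absent. */
    synchronized V getCached(K key) {
        return map.remove(key);
    }

    /** Adds a free statement; returns false if the key is already cached. */
    synchronized boolean cache(K key, V statement) {
        if (map.containsKey(key)) {
            return false;
        }
        map.put(key, statement);
        return true;
    }

    synchronized int size() {
        return map.size();
    }
}
```

          Because a statement leaves the map while it is in use and re-enters as the
          newest entry when it is closed, the eldest entry is always the one that
          has gone unused the longest, so access-based ordering is not required for
          this policy.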

          Kristian Waagan added a comment -

          I have spent some time looking into code sharing and have come to the conclusion that I will not deal with it at this time.
          There was lots of discussion a while ago, but no viable/acceptable solution for enabling "proper" code sharing was found.
          My decision is based on the limited time until feature freeze, and on my impression that code sharing can be a hard nut to crack.

          I do however still believe parts of the code can be shared between the embedded and the client driver, specifically the implementation of the cache itself (currently located in the "am.stmtcache" package).

          Kristian Waagan added a comment -

          I created sub-tasks for what I believe will be the various parts of the statement pooling feature implementation.
          DERBY-3326 is expected to be the one requiring most work, and it is here the issues regarding correctness, proper behavior and edge-cases will be handled.

          Basic testing will be handled in each sub-task, and there might be a separate patch in the end with more complex tests.
          I expect to provide patches for DERBY-3324 and DERBY-3325 first.

          Kristian Waagan added a comment -

          Updated 'JDBCClientStatementCacheOverview.txt' based on comments, and also added a little more information.

          Kristian Waagan added a comment -

          Linking to DERBY-3596, as that issue has severe consequences for performance and stability of the JDBC statement pooling feature.
          This means that the use of statement pooling in the client driver should be delayed till the next update release for 10.4 or the next feature release.
          I believe this has been mentioned in the release notes for 10.4.

          Kristian Waagan added a comment -

          'bank_transaction_stmtcache_16c.png' shows a graph of the performance of Derby running the bank transaction performance client (see DERBY-3619) in three different configurations:
          a) Reuse of a single prepared statement for each query (original client code)
          b) Reprepare the statement every time with JDBC statement caching enabled
          c) Reprepare the statement every time with JDBC statement caching disabled

          I modified the original performance client to achieve b and c.
          The Derby network server was started with a page cache size of 10 000 and a max heap of 512 M, and the JDBC statement cache size was set to 15.
          The client was invoked with the following arguments:
          -driver org.apache.derby.jdbc.ClientDriver -url "jdbc:derby://myHost/tpcbDB;create=true" -load bank_tx -threads 16 -load_opts "branches=16,accountsPerBranch=6250" [-init]

          As the graph shows, the performance hit from repreparing statements with the JDBC client-side statement cache is almost negligible with this load profile. The extra work is performed on the client, and in this case there were plenty of free CPU resources on the client machine. The fact that there are many clients makes the difference even smaller.

          I'm also running with only one client, and the results seem to suggest a small performance overhead for the JDBC statement cache. This is to be expected, as more work has to be done than when reusing a single prepared statement. I'll post the results later when the run is finished.

          If I get some free cycles, I might run a read-only test too.

          Kristian Waagan added a comment -

          'package.html' is the first revision of a simple package level description of the JDBC statement cache.

          It would be nice if someone could read through it and comment on two things:
          1) Language errors
          2) Missing or poor information about the cache.

          For instance, more information regarding the implementation could be added.

          Knut Anders Hatlen added a comment -

          The description looks very good to me. It would be good if some of this information could also be put into one of the manuals. I think most of it could be copied directly.

          Kristian Waagan added a comment -

          Knut Anders, thanks for looking at the documentation in package.html.
          I committed a slightly modified version (attached, removed first header due to JavaDoc tool conventions and added links to implementation classes at the bottom) to trunk with revision 675508 and merged to 10.4 with revision 675510.

          Further changes to the package description can be done later.

          Kristian Waagan added a comment -

          No more work currently planned for the feature.

          Rick Hillegas added a comment -

          Marking Fix Version as 10.4.2 because the work has been ported to 10.4.

          Kristian Waagan added a comment -

          Closing the issue, as there is no more work planned.
          The feature has gone through initial testing (code inspection, Derby regression tests and an enterprise level benchmark) and documentation has been added.

          Brett Wooldridge added a comment -

          I would like to re-open this issue. Prepared statement caching is still needed on XA connections.

          Kristian Waagan added a comment -

          Hi Brett,

          Are you in a position where you can test out a build that has caching enabled for XA connections?
          I'd need to have a look at the code again, but enabling caching for XA connections is very easy. However, the XA code has some special logic here and there, and I would have to check whether any of it interferes with the cache.

          I'd prefer if you create a separate issue for tracking the work on prepared statement caching for XA connections (mostly due to the amount of comments on this one), but feel free to reopen this issue and link to the new one.


            People

            • Assignee:
              Kristian Waagan
              Reporter:
              Kristian Waagan