Derby
  1. Derby
  2. DERBY-3312

Local Network Server Performance degradation with 10.2 or later

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Won't Fix
    • Affects Version/s: 10.2.1.6, 10.2.2.0, 10.3.1.4, 10.3.2.1
    • Fix Version/s: None
    • Component/s: Network Server
    • Environment:
      Intel x86 based server SuSE Linux Enterprise Server 10
      Java(TM) SE Runtime Environment (build 1.6.0_03-b05)
      Derby 10.3.2.1
    • Urgency:
      Urgent
    • Issue & fix info:
      High Value Fix, Repro attached
    • Bug behavior facts:
      Performance, Regression, Seen in production

      Description

      We have a Java based XML-RPC client/server product that incorporates an embedded Derby database and network server. Our client uses the derby JDBC ndriver and network client to connect to the Derby Network Server.

      We recently moved from Cloudscape, which I believe used the 10.1.3.1 Derby code, because of other issues which seem to be resolved by moving to the latest Derby release. We have a very simple database with a simple table. This table does include BLOBs, however its size has not been an issue and we limit our records to 500.

      Since moving to the latest release of Derby, version 10.3.2.1, we noticed that our clients running on the same machine as our server take much longer to retrieve a list of records from the database. Our clients running on a remote machine do not seem to have any performance issues when retrieving the same list of records.

      We start our Network Server in Java through the API so I don't think the Security Manager is the issue. I read that performance could be affected by the Security Manager, but according to the Derby documentation,

      "The Network Server will not attempt to install a security manager if you start the server from your application using the programmatic API ..."

      I tried going back several releases of Derby and the performance issue seems to go away when I run with version 10.1.3.1 of Derby. However we see the same issue that we saw with Cloudscape in that we can not turn off connection logging. We also had stability problems with the Network Server with Cloudscape.

      We would really prefer to use the latest Derby release however the performance issues are a sever limitation. I thought that maybe this was a simple Network Server configuration issue however after researching this issue I have not found anything from a configuration standpoint that may help.

      1. LobPerf.java
        2 kB
        Kathey Marsden

        Issue Links

          Activity

          Transition Time In Source Status Execution Times Last Executer Last Execution Date
          Open Open Resolved Resolved
          2109d 14h 54m 1 Kathey Marsden 19/Oct/13 17:24
          Resolved Resolved Closed Closed
          458d 7h 58m 1 Myrna van Lunteren 21/Jan/15 00:23
          Myrna van Lunteren made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Hide
          Myrna van Lunteren added a comment -

          bulk change to close all issues resolved but not closed and not changed since June 1, 2014.

          Show
          Myrna van Lunteren added a comment - bulk change to close all issues resolved but not closed and not changed since June 1, 2014.
          Kathey Marsden made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Resolution Won't Fix [ 2 ]
          Hide
          Kathey Marsden added a comment -

          Received no objections to closing this won't fix. Locators improved performance on some operations and slowed others.

          Show
          Kathey Marsden added a comment - Received no objections to closing this won't fix. Locators improved performance on some operations and slowed others.
          Gavin made changes -
          Workflow jira [ 12420583 ] Default workflow, editable Closed status [ 12802573 ]
          Kathey Marsden made changes -
          Labels derby_triage10_9 derby_triage10_11
          Hide
          Kathey Marsden added a comment -

          I propose we close this issue won't fix. The introduction of locators slowed some operations significantly and that improved performance for others. I think that has become the new performance norm over the last 8 releases, so reasonable to close this won't fix.

          Show
          Kathey Marsden added a comment - I propose we close this issue won't fix. The introduction of locators slowed some operations significantly and that improved performance for others. I think that has become the new performance norm over the last 8 releases, so reasonable to close this won't fix.
          Kathey Marsden made changes -
          Labels derby_triage10_5_2 derby_triage10_9
          Hide
          Kathey Marsden added a comment -

          Using the attached reproduction, I verified this is still an issue with 10.8.2.2
          I have a new machine (Windows 7, i7 quad core, 4GB RAM) so re-baselined with some earlier versions.
          10.8.2.2 is slightly faster than 10.3 but on my fast machine now almost eight times slower than 10.1.

          10.1.3.1 Elapsed milliseconds = 78861
          10.3.3.0 Elapsed milliseconds = 533476
          10.8.2.0 Elapsed milliseconds = 479183

          I am guessing the severe performance degradation with small blobs started with the switch to using stored procedures to retrieve Blobs rather than materializing them, but it would probably be good as a first step to isolate what change in 10.3 caused the problem.

          Show
          Kathey Marsden added a comment - Using the attached reproduction, I verified this is still an issue with 10.8.2.2 I have a new machine (Windows 7, i7 quad core, 4GB RAM) so re-baselined with some earlier versions. 10.8.2.2 is slightly faster than 10.3 but on my fast machine now almost eight times slower than 10.1. 10.1.3.1 Elapsed milliseconds = 78861 10.3.3.0 Elapsed milliseconds = 533476 10.8.2.0 Elapsed milliseconds = 479183 I am guessing the severe performance degradation with small blobs started with the switch to using stored procedures to retrieve Blobs rather than materializing them, but it would probably be good as a first step to isolate what change in 10.3 caused the problem.
          Hide
          Kathey Marsden added a comment -

          Timothy, could you check and see if the fix for DERBY-3791 improved the performance?

          Show
          Kathey Marsden added a comment - Timothy, could you check and see if the fix for DERBY-3791 improved the performance?
          Kathey Marsden made changes -
          Labels LOB derby_triage10_5_2
          Kristian Waagan made changes -
          Labels LOB
          Kathey Marsden made changes -
          Bug behavior facts [Regression, Performance] [Performance, Regression, Seen in production]
          Issue & fix info [High Value Fix] [High Value Fix, Repro attached]
          Hide
          Kathey Marsden added a comment -

          Triaged for 10.5.2. I verified that this is still an issue with 10.5. The program LobPerf ran
          126375ms with 10.1.3.1
          696594ms with 10.5.1.1

          so more than 5 times slower. When I ran it seemed that later iterations were running slower, so I thought maybe the lobs were not getting garbage collected properly since they were not being freed, so perhaps thrashing was causing the slow down. I added a c.free after each row and found it to be almost three times as slow as the 10.5 run without the free: 1973109ms.

          I'll leave this as urgent since it seems to be a serious regression, although I don't know if anyone has any good ideas how we might fix this, especially in a maintenance release.

          Show
          Kathey Marsden added a comment - Triaged for 10.5.2. I verified that this is still an issue with 10.5. The program LobPerf ran 126375ms with 10.1.3.1 696594ms with 10.5.1.1 so more than 5 times slower. When I ran it seemed that later iterations were running slower, so I thought maybe the lobs were not getting garbage collected properly since they were not being freed, so perhaps thrashing was causing the slow down. I added a c.free after each row and found it to be almost three times as slow as the 10.5 run without the free: 1973109ms. I'll leave this as urgent since it seems to be a serious regression, although I don't know if anyone has any good ideas how we might fix this, especially in a maintenance release.
          Dag H. Wanvik made changes -
          Bug behavior facts [Regression] [Performance, Regression]
          Issue & fix info [High Value Fix]
          Dag H. Wanvik made changes -
          Bug behavior facts [Performance, High Value Fix] [Regression]
          Dag H. Wanvik made changes -
          Derby Categories [High Value Fix] [High Value Fix, Performance]
          Dag H. Wanvik made changes -
          Component/s Performance [ 11709 ]
          Hide
          Kristian Waagan added a comment -

          I see.

          You can continue using BLOB, I just want to point out that the performance impact will be higher the smaller the BLOB is.
          It would be nice to get some numbers for slightly larger Blobs, as minimal Blobs is the worst case scenario.

          The call to get the length is done to request the exact number of bytes from the server. If it is removed, we need to change the exit logic in the client side LOB stored procedure method.
          Also note that the BLOBs in the reported repro are so small that the repositioning bug shouldn't affect it.

          Show
          Kristian Waagan added a comment - I see. You can continue using BLOB, I just want to point out that the performance impact will be higher the smaller the BLOB is. It would be nice to get some numbers for slightly larger Blobs, as minimal Blobs is the worst case scenario. The call to get the length is done to request the exact number of bytes from the server. If it is removed, we need to change the exit logic in the client side LOB stored procedure method. Also note that the BLOBs in the reported repro are so small that the repositioning bug shouldn't affect it.
          Hide
          Timothy Graf added a comment -

          Our BLOBs vary in size so I'm not so sure about changing the column type to VARCHAR [length FOR BIT DATA] .

          We use a BLOB because we are storing a serialized Java object. This object is comprised of several other objects. We had considered storing all of the constituent object property values and rebuilding the object from values stored in a table with columns for each property instead we opted for storing the serialized object. We would prefer to continue storing the serialized object if possible.

          Show
          Timothy Graf added a comment - Our BLOBs vary in size so I'm not so sure about changing the column type to VARCHAR [length FOR BIT DATA] . We use a BLOB because we are storing a serialized Java object. This object is comprised of several other objects. We had considered storing all of the constituent object property values and rebuilding the object from values stored in a table with columns for each property instead we opted for storing the serialized object. We would prefer to continue storing the serialized object if possible.
          Hide
          Kathey Marsden added a comment -

          I just made the repro based on the user's description, so can't speak to whether the reporter could adjust his column types or not. I would think it would be common to have a mixture of small and large lobs where changing the column type would not be an option. I think your proposal to eliminate one round trip sounds like a good start if it can be done with reasonable effort.

          Show
          Kathey Marsden added a comment - I just made the repro based on the user's description, so can't speak to whether the reporter could adjust his column types or not. I would think it would be common to have a mixture of small and large lobs where changing the column type would not be an option. I think your proposal to eliminate one round trip sounds like a good start if it can be done with reasonable effort.
          Hide
          Kristian Waagan added a comment -

          I started looking at Blob again, since Clob has some additional complexity.
          I'm unable to see the performance degradation for the 10.2 versions with my Blob data (only 1 to 5 bytes long), but when I move to 10.3 it's there.
          When moving to 10.3, the time it takes to run the repro increases 3-4 times.

          I believe the performance degradation is caused by the introduction of locators, and more specifically the increased number of round-trips:
          a) We get the locator value (rs.next()).
          b) We access the server, using the locator value, to obtain the Blob length.
          c) We access the server, using the locator value, to fetch the Blob data.

          I think round-trip b) can be removed most easily, but there are details that might make it harder to do.
          With a hack I got the duration (fetching 15000 small Blobs) down from around 2100 ms to 1400 ms. The duration for 10.2 is just under 600 ms.

          I think getting the current implementation down to the duration seen with 10.2 is impossible, due to the increased number of round-trips and the extra code for LOBs.
          One could make Derby capable of using locators for only selected rows, and send the complete content where the Blob is small enough. This would increase the code complexity significantly.

          On the other hand, I'm not sure this is an issue that should be fixed by Derby. LOBs are intended to be used for large objects. Would using VARCHAR [length FOR BIT DATA] be better for situations like this?
          My repro gets down to less than 25 ms (trunk) using this data type instead of BLOB. I think the reasons are reduced number of round-trips and prefetching.
          Then the questions are, why is BLOB used is in this specific case and can it be replaced with VARCHAR length FOR BIT DATA?

          Show
          Kristian Waagan added a comment - I started looking at Blob again, since Clob has some additional complexity. I'm unable to see the performance degradation for the 10.2 versions with my Blob data (only 1 to 5 bytes long), but when I move to 10.3 it's there. When moving to 10.3, the time it takes to run the repro increases 3-4 times. I believe the performance degradation is caused by the introduction of locators, and more specifically the increased number of round-trips: a) We get the locator value (rs.next()). b) We access the server, using the locator value, to obtain the Blob length. c) We access the server, using the locator value, to fetch the Blob data. I think round-trip b) can be removed most easily, but there are details that might make it harder to do. With a hack I got the duration (fetching 15000 small Blobs) down from around 2100 ms to 1400 ms. The duration for 10.2 is just under 600 ms. I think getting the current implementation down to the duration seen with 10.2 is impossible, due to the increased number of round-trips and the extra code for LOBs. One could make Derby capable of using locators for only selected rows, and send the complete content where the Blob is small enough. This would increase the code complexity significantly. On the other hand, I'm not sure this is an issue that should be fixed by Derby. LOBs are intended to be used for large objects. Would using VARCHAR [length FOR BIT DATA] be better for situations like this? My repro gets down to less than 25 ms (trunk) using this data type instead of BLOB. I think the reasons are reduced number of round-trips and prefetching. Then the questions are, why is BLOB used is in this specific case and can it be replaced with VARCHAR length FOR BIT DATA?
          Hide
          Kristian Waagan added a comment -

          I investigated a bit further, and found out that the embedded driver has regressed too. See DERBY-3791.

          Show
          Kristian Waagan added a comment - I investigated a bit further, and found out that the embedded driver has regressed too. See DERBY-3791 .
          Kristian Waagan made changes -
          Link This issue relates to DERBY-3791 [ DERBY-3791 ]
          Kathey Marsden made changes -
          Derby Categories [High Value Fix]
          Hide
          Kristian Waagan added a comment -

          I tried the repro attached here, and saw similar results as Knut Anders describes.
          I think 10.4 ran in around 20 minutes, whereas it would take between 20 and 25 hours with 10.2!

          If I get some free cycles, I'll look into why this is the case.

          Show
          Kristian Waagan added a comment - I tried the repro attached here, and saw similar results as Knut Anders describes. I think 10.4 ran in around 20 minutes, whereas it would take between 20 and 25 hours with 10.2! If I get some free cycles, I'll look into why this is the case.
          Hide
          Knut Anders Hatlen added a comment -

          The CLOB test recently added to org.apache.derbyTesting.perf.clients.Runner is similar to the LobPerf class attached to this issue. I see a similar degradation when i compare 10.2.2.0 and the current trunk with the client and the server running on the same machine. However, if I move the server to a separate machine, 10.2.2.0 is about 50 times slower than trunk.

          Show
          Knut Anders Hatlen added a comment - The CLOB test recently added to org.apache.derbyTesting.perf.clients.Runner is similar to the LobPerf class attached to this issue. I see a similar degradation when i compare 10.2.2.0 and the current trunk with the client and the server running on the same machine. However, if I move the server to a separate machine, 10.2.2.0 is about 50 times slower than trunk.
          Kathey Marsden made changes -
          Derby Info [Regression]
          Hide
          Kathey Marsden added a comment -

          Marking as a regression since repro shows significant performance degradation with 10.3

          Kathey

          Show
          Kathey Marsden added a comment - Marking as a regression since repro shows significant performance degradation with 10.3 Kathey
          Kathey Marsden made changes -
          Attachment LobPerf.java [ 12374725 ]
          Hide
          Kathey Marsden added a comment -

          I ran the attached program with different versions of derby. It times 100 selects from a table with 14000 rows of data with small Clobs. I ran it a couple times on each version.
          Below are the averages.
          10.1.3.1 - 138257ms
          10.2.2.0 - 150976ms
          10.3.2.1 (no security manager) - 310773ms

          So for 10.2 we see some modest performance degradation, but for 10.3 we are more than twice as slow for small lobs with network server. I didn't try with embedded.

          Show
          Kathey Marsden added a comment - I ran the attached program with different versions of derby. It times 100 selects from a table with 14000 rows of data with small Clobs. I ran it a couple times on each version. Below are the averages. 10.1.3.1 - 138257ms 10.2.2.0 - 150976ms 10.3.2.1 (no security manager) - 310773ms So for 10.2 we see some modest performance degradation, but for 10.3 we are more than twice as slow for small lobs with network server. I didn't try with embedded.
          Knut Anders Hatlen made changes -
          Component/s Performance [ 11709 ]
          Daniel John Debrunner made changes -
          Field Original Value New Value
          Summary Local Network Server Performance Local Network Server Performance degradation with 10.2 or later
          Hide
          Daniel John Debrunner added a comment -

          Clarify the summary to have some meaning.

          Show
          Daniel John Debrunner added a comment - Clarify the summary to have some meaning.
          Hide
          Timothy Graf added a comment -

          This is our create table command.

          CREATE TABLE INT_LOOKUPS(
          LKP_ID INTEGER GENERATED ALWAYS AS IDENTITY (START WITH 1, INCREMENT BY 1),
          LKP_TYPE INTEGER,
          LKP_DATE DATE,
          PREV_LKP BLOB)

          From one of our test servers here is the number of record and BLOB statistics.

          Number of records 497
          Average BLOB size 2349 bytes
          Minimum BLOB size 1993 bytes
          Maximum BLOB size 2774 bytes

          Show
          Timothy Graf added a comment - This is our create table command. CREATE TABLE INT_LOOKUPS( LKP_ID INTEGER GENERATED ALWAYS AS IDENTITY (START WITH 1, INCREMENT BY 1), LKP_TYPE INTEGER, LKP_DATE DATE, PREV_LKP BLOB) From one of our test servers here is the number of record and BLOB statistics. Number of records 497 Average BLOB size 2349 bytes Minimum BLOB size 1993 bytes Maximum BLOB size 2774 bytes
          Hide
          Timothy Graf added a comment -

          Yes, for our client retrieving records over the network is faster than one of our clients running on the same machine as our server. This symptom is only seen in versions 10.2.1.6 of Derby or higher. If I revert back to version 10.1.3.1 of Derby we do not see the difference in performance for a client run locally versus over the network.

          Our clients use JDBC to retrieve the entire list of records from our only table in the database and yes, we do limit the maximum number of records to 500. I'll reply again with the additional information you requested.

          Thanks very much for your reply.

          Show
          Timothy Graf added a comment - Yes, for our client retrieving records over the network is faster than one of our clients running on the same machine as our server. This symptom is only seen in versions 10.2.1.6 of Derby or higher. If I revert back to version 10.1.3.1 of Derby we do not see the difference in performance for a client run locally versus over the network. Our clients use JDBC to retrieve the entire list of records from our only table in the database and yes, we do limit the maximum number of records to 500. I'll reply again with the additional information you requested. Thanks very much for your reply.
          Hide
          Øystein Grøvlen added a comment -

          Are you saying that retrieving records over the network is faster than local retrieval? Or just that the slow-down with the new version is larger for local retrieval than for remote retrieval? If the former, it sounds like running the clients and the server on the same computer makes the system CPU-bound. That is, that either clients or server or both use more CPU with 10.3.

          I would very much like to be able to reproduce your problems. If possible, it would be nice if you could post the DDL for the table and some information about size distribution for the Blobs involved. (If I understand you correctly there is max. 500 rows in the table.) Also of interest is what the clients do (JDBC code)? Is it just a sequence of getXXX calls for selected columns of the table?

          Show
          Øystein Grøvlen added a comment - Are you saying that retrieving records over the network is faster than local retrieval? Or just that the slow-down with the new version is larger for local retrieval than for remote retrieval? If the former, it sounds like running the clients and the server on the same computer makes the system CPU-bound. That is, that either clients or server or both use more CPU with 10.3. I would very much like to be able to reproduce your problems. If possible, it would be nice if you could post the DDL for the table and some information about size distribution for the Blobs involved. (If I understand you correctly there is max. 500 rows in the table.) Also of interest is what the clients do (JDBC code)? Is it just a sequence of getXXX calls for selected columns of the table?
          Timothy Graf created issue -

            People

            • Assignee:
              Unassigned
              Reporter:
              Timothy Graf
            • Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development