Issue Details (XML | Word | Printable)

Key: DERBY-3766
Type: Bug Bug
Status: Closed Closed
Resolution: Fixed
Priority: Major Major
Assignee: Kristian Waagan
Reporter: Kristian Waagan
Votes: 0
Watchers: 0
Operations

If you were logged in you would be able to see more operations.
Derby

EmbedBlob.setPosition is highly ineffective for streams

Created: 09/Jul/08 08:43 AM   Updated: 01/Jul/09 04:29 PM
Return to search
Component/s: JDBC, Network Server
Affects Version/s: 10.1.3.1, 10.2.2.0, 10.3.3.0, 10.4.1.3, 10.5.1.1
Fix Version/s: 10.4.2.0, 10.5.1.1

Time Tracking:
Not Specified

File Attachments:
  Size
File Licensed for inclusion in ASF works derby-3766-1a-preparations.diff 2008-07-17 12:36 PM Kristian Waagan 5 kB
File Licensed for inclusion in ASF works derby-3766-2a-position_fix.diff 2008-07-18 11:33 AM Kristian Waagan 20 kB
Issue Links:
Reference
 
dependent
 

Bug behavior facts: Regression, Performance
Resolution Date: 05/Aug/08 05:14 PM


 Description  « Hide
The EmbedBlob.setPosition implementation has two performance problems when the
Blob is represented by a store stream, at least one of them rather significant:

  1) The store stream is reset to position zero for each position request.
     Data is then read until the requested position has been reached.

  2) 'read' is used instead of 'skip', which causes Derby to miss out on the
     optimization potential with streams that have a more efficient skip mechanism.

My gut feeling is that once point 1) has been fixed, point 2) will have
disappeared. Also note that the reason why the unconditional reset approach was
chosen was because the blob implementation couldn't keep track of the underlying
streams position. This issue still has to be addressed.


Performance degradation
=======================
Observations suggest the following approximation can be used to quantify the
number of bytes that have to be processed on the server. Goes for both the
embedded and the client driver when using the EmbedBlob.getBytes call.
    s = size of the blob
    b = buffer/block size
    n = s/b (number of iterations needed to read the whole Blob)

    s + b * n * (n+1) / 2 = number of bytes processed on the server side
    From now on, I ignore the s.

For a 1 MB Blob when using a buffer of 32 000 bytes, we get:
   n = 1 * 1'024 * 1'024 / 32'000 ~ 33
   32'000 * 33 * (33+1) / 2 = 17'952'000 ~ 17 MB

To quickly verify the approximation I summed up the bytes processed by
EmbedBlob.getBytes(long,int) and EmbedBlob.setPosition(long) with the 64 MB
Blob used by the repro for DERBY-550 (modified to use 32'000 read buffer):
  - approximation: 32'000 * 2097 * (2097+1) / 2 = 70392096000 ~ 66 GB
  - Derby byte count = 70459204864 ~ 66 GB
  - Derby byte count (buffer 33'000, see below) = 136526134864 ~ 127 GB

I'll explain the biggest number further down.
As you see, the number of bytes processed is huge. Note that all of the
gigabytes are just the 64 MB repeated over and over again. Since the actual data
volume is so small, all the data will be in the caches of Derby and the
operating system. Note that only 64 MB is actually transferred to the client
when using the client driver.

Another consequence of processing all this data repeatedly, is the effect it has
on the page cache. Pages has to be evicted and read back in. The performance hit
taken by this depends on the page cache size, operating system buffers and other
database activities on the system.

The explanation for the largest number above, is another performance issue in
the client driver, related to locators. I'll explain it more detailed in a
separate Jira issue, but in short the issue causes the stream to be reset even
more frequently for read buffer sizes over 32'672 bytes.


Suggested fix
=============
My initial analysis suggests that the problem can be fixed by using the
functionality provided by PosistionedStoreStream. There are a few complicating
issues:
 a) The length encoding bytes must be accounted for properly.
 b) One must make sure all stream access happens through the
    PositionedStoreStream, otherwise the position will be incorrect and the
    wrong data will be returned.

With a prototype fix, the repro duration went down from minutes (7 minutes
for the 127 GB case) down to between 2 and 4 seconds with a sane build, running
on localhost.



Affected versions
=================
The code suffering from the performance issues is old, but because it isn't
used in the same way in all versions some releases are more affected than
others.

Version Embedded.getBytes Client.getXXX
10.5 X X
10.4.1.3 X X
10.3.3.0 X X
10.2.2.1 X _
10.1.3.1 X _
(X = issue present)

 All   Comments   Work Log   Change History   Subversion Commits      Sort Order: Ascending order - Click to sort in descending order
Kristian Waagan added a comment - 09/Jul/08 08:48 AM
Marking as regression.
This is only true for the client driver, where Blob performance has dropped significantly after 10.2.
The reason is that the existing code with the performance issues has been taken into use by the implementation of locators.

Kristian Waagan added a comment - 17/Jul/08 12:36 PM
'derby-3766-1a-preparations.diff' is a preparation patch for later changes. It extracts more trivial changes to make the later patch(es) a little smaller:
 a) Change type of 'myStream' from InputStream to PositionedStoreStream.
    This includes removing superfluous casts.
 b) Fixed typo in JavaDoc for 'streamLength'
 c) Improved JavaDoc for 'read()'
 d) Made exception in 'truncate' use the correct position argument.
 e) Added/updated a few comments.

Kristian Waagan added a comment - 17/Jul/08 03:47 PM
Committed 'derby-3766-1a-preparations.diff' to trunk with revision 677619.
suites.All and derbyall ran without failures with Sun JDK 1.6 on Solaris 10.

Kristian Waagan added a comment - 17/Jul/08 03:57 PM
Added "loose" dependency to DERBY-3783. It is only required to write a cleaner fix, it doesn't matter functionally.

Kristian Waagan added a comment - 18/Jul/08 11:33 AM
'derby-3766-2a-position_fix.diff' is the first attempt at fixing the performance problem.

Patch descriptions:
 a) Removed the 'pos' and 'biStream' variables. A "global" Blob position is no longer maintained by EmbedBlob.

 b) Introduced 'streamPositionOffset' to account for the encoded length bytes. Only used when the Blob is represented by a store stream. Set to Integer.MIN_VALUE when unused. Otherwise, the offset is obtained in the constructor, where the Blob length is saved for later use as well if it is known (the stream is not skipped to find length).

 c) Removed 'BLOB_BUF_SIZE' and 'buf', they're no longer needed.

 d) Renamed 'setPosition' to 'setBlobPosition' and rewrote the method. The method now relies on PositionedStoreStream to do the repositioning. In some cases this will lead to using the same algorithm as earlier, otherwise it will be more effective because the stream isn't reset. Note the addition of 'streamPositionOffset' to the logical zero-based Blob position. Also note that another SQLState is used. The method also returns the new position (this is only for convenience).

 e) Changes in read(); added pos argument, don't throw SQLException, use the correct stream.

 f) Updated length() to reflect changes.

 g) Rewrote the two 'position' methods to maintain a local position variable. Also removed special handling for the BLOB_LENGTH_TOO_LONG exception.

 h) Updated JavaDoc and added position argument for the two 'checkMatch' methods.

 i) Added one new test in BlobClob4BlobTest.

I thought about adding a "inefficiency metric" by checking how often PositionedStoreStream resets the underlying stream, but don't know of a simple way to record and present this info.

Also note that this change doesn't protect users from poor access patterns, like fetching the Blob from the end using 'getBytes' or something like that...
'position' can also suffer, depending on how many partial matches there are in the stream. I think these implementations can be optimized, as a comment in the code also says, by operating on a buffer instead of reading one byte at the time from the Blob representation. Such a change could avoid resetting the stream too much for patterns smaller than the read buffer size.

The message to pass across here is that moving backwards in the Blob content is expensive.


Patch ready for review.

Kristian Waagan added a comment - 31/Jul/08 12:44 PM
Committed patch 2a to trunk with revision 681359.

Kristian Waagan added a comment - 05/Aug/08 05:14 PM
Backported patches 1a and 2a to 10.4 with revision 682803. They went in together with the second patch for DERBY-3783.