|
'derby-3766-1a-preparations.diff' is a preparation patch for later changes. It extracts more trivial changes to make the later patch(es) a little smaller:
a) Change type of 'myStream' from InputStream to PositionedStoreStream. This includes removing superfluous casts. b) Fixed typo in JavaDoc for 'streamLength' c) Improved JavaDoc for 'read()' d) Made exception in 'truncate' use the correct position argument. e) Added/updated a few comments. Committed 'derby-3766-1a-preparations.diff' to trunk with revision 677619.
suites.All and derbyall ran without failures with Sun JDK 1.6 on Solaris 10. Added "loose" dependency to
'derby-3766-2a-position_fix.diff' is the first attempt at fixing the performance problem.
Patch descriptions: a) Removed the 'pos' and 'biStream' variables. A "global" Blob position is no longer maintained by EmbedBlob. b) Introduced 'streamPositionOffset' to account for the encoded length bytes. Only used when the Blob is represented by a store stream. Set to Integer.MIN_VALUE when unused. Otherwise, the offset is obtained in the constructor, where the Blob length is saved for later use as well if it is known (the stream is not skipped to find length). c) Removed 'BLOB_BUF_SIZE' and 'buf', they're no longer needed. d) Renamed 'setPosition' to 'setBlobPosition' and rewrote the method. The method now relies on PositionedStoreStream to do the repositioning. In some cases this will lead to using the same algorithm as earlier, otherwise it will be more effective because the stream isn't reset. Note the addition of 'streamPositionOffset' to the logical zero-based Blob position. Also note that another SQLState is used. The method also returns the new position (this is only for convenience). e) Changes in read(); added pos argument, don't throw SQLException, use the correct stream. f) Updated length() to reflect changes. g) Rewrote the two 'position' methods to maintain a local position variable. Also removed special handling for the BLOB_LENGTH_TOO_LONG exception. h) Updated JavaDoc and added position argument for the two 'checkMatch' methods. i) Added one new test in BlobClob4BlobTest. I thought about adding a "inefficiency metric" by checking how often PositionedStoreStream resets the underlying stream, but don't know of a simple way to record and present this info. Also note that this change doesn't protect users from poor access patterns, like fetching the Blob from the end using 'getBytes' or something like that... 'position' can also suffer, depending on how many partial matches there are in the stream. I think these implementations can be optimized, as a comment in the code also says, by operating on a buffer instead of reading one byte at the time from the Blob representation. Such a change could avoid resetting the stream too much for patterns smaller than the read buffer size. The message to pass across here is that moving backwards in the Blob content is expensive. Patch ready for review. Committed patch 2a to trunk with revision 681359.
Backported patches 1a and 2a to 10.4 with revision 682803. They went in together with the second patch for
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
This is only true for the client driver, where Blob performance has dropped significantly after 10.2.
The reason is that the existing code with the performance issues has been taken into use by the implementation of locators.