Issue Details (XML | Word | Printable)

Key: DERBY-2586
Type: Bug Bug
Status: Closed Closed
Resolution: Fixed
Priority: Major Major
Assignee: Anurag Shekhar
Reporter: Øystein Grøvlen
Votes: 0
Watchers: 0
Operations

If you were logged in you would be able to see more operations.
Derby

BlobClob4BlobTest.tesPositionAgressive takes very long time

Created: 24/Apr/07 12:54 PM   Updated: 30/Jun/09 03:55 PM
Return to search
Component/s: JDBC
Affects Version/s: None
Fix Version/s: 10.3.1.4

Time Tracking:
Not Specified

File Attachments:
  Size
Text File Licensed for inclusion in ASF works derby-2586.diff 2007-04-24 06:59 PM Anurag Shekhar 0.9 kB

Bug behavior facts: Regression
Resolution Date: 02/May/07 07:00 AM


 Description  « Hide
Dan reports that BlobClob4BlobTest.tesPositionAgressive takes very long to run:
> In a test run (junit-all) of 5,816 fixtures that took 4124 seconds (~68mins) *three* fixtures took 53% of the total time.
>
> testPositionAgressive 1564.684 seconds
> testNetworkServerSecurityMechanism 497.280 seconds
> testTypesInActionStatement 128.928 seconds

Knut Anders reports that this behavior is fairly new:
> It seems like it started taking a lot more time at revision 530085. I had
> run the tests at an earlier revision.

This was a check-in for DERBY-2346. Since Anurag is currently on vacation, I will try to look into this.

 All   Comments   Work Log   Change History   Subversion Commits      Sort Order: Ascending order - Click to sort in descending order
Øystein Grøvlen added a comment - 24/Apr/07 01:08 PM
It seems like the new behavior is that characters are read one at a time when seeking for a position.
I guess that the reason may be that the new streams that are used do not do buffering. They rely on the caller to use the byte[] version of read if many characters are to be read.

However, UTF8Reader.fillBuffer reads one byte at a time. My first attempt at fixing this will be to change UTF8Reader to fill its buffer with the byte[] version of read. I think that is better than making the new streams do buffering since it seems unecessary to do buffering at several levels.

Anurag Shekhar added a comment - 24/Apr/07 06:10 PM
Easiest way of providing buffering over normal stream is to wrap it under BufferedInputStream I am trying out that change to see if it improves significant performance improvement.

I suspect the problem is with 4k buffer size in LOBStreamControl before it moves the data to file. I think i need to make this larger. While debugging I noticed even very large clob are kept in memory so all the test case of BlobClob4BlobTest.testPositionAgressive used to operate on in memory string. Now moving it onto stream has made is slow (this also explains why DERBY-2450 remained unexposed).

Anurag Shekhar added a comment - 24/Apr/07 06:59 PM
Modified UTF8Reader to wrap input stream (received via constructor) in BufferedInputStream

I haven't run the complete suite but org.apache.derbyTesting.functionTests.tests.jdbcapi._Suite runs with out any failure (in 1,111.169 sec)

org.apache.derbyTesting.functionTests.tests.jdbcapi.BlobClob4BlobTest now takes 151.561 sec.

Øystein Grøvlen added a comment - 24/Apr/07 09:55 PM
Thanks Anurag for looking at this while on vacation. Your patch looks good and I recommend it to be committed so that the runtime of the junit test suite gets back to normal.

As a side note, I do not understand why UTF8Reader buffers the converted characters. Would it not have been better to buffer the incoming bytes and rather convert characters on demand?

Daniel John Debrunner added a comment - 24/Apr/07 11:01 PM
Patch Committed revision 532127 - Thanks Anurag.

Sunitha Kambhampati added a comment - 25/Apr/07 02:38 AM
fwiw, I also looked at the change and it looks good. I ran the BlobClob4BlobTest on my linux box and before the change - it took 3197s, with this change it took 211s.

Anurag>"I suspect the problem is with 4k buffer size in LOBStreamControl before it moves the data to file."
do you plan to make some changes related to this in some other jira issue , if so can you point me to that jira issue.

Thanks.

Daniel John Debrunner added a comment - 25/Apr/07 04:38 PM
My test runs dropped from the 68mins to 39mins. Thanks Anurag and Øystein for the quick response.

Myrna van Lunteren added a comment - 01/May/07 09:28 PM
This looks like it is fixed, can it be closed?

Øystein Grøvlen added a comment - 02/May/07 07:00 AM
With the fix applied, the run time is back to normal.

Anurag Shekhar added a comment - 21/May/07 07:34 AM
sorry for the late response.
In Blob/clob before making the changes the data was held in memory
depending on the free space in the page. So some times even huge
clob/blob can be used as String. In case of testPositionAggressive all
the tests were running in this mode and String.indexOf was giving result
very fast.

Once I changed the blob/clob to hold only 4k in memory and if it exceeds
use a file it started taking very long because of the file i/o over
heads. If we change the max buffer size (higher than 4k) we will be
getting better performance for clob/blob of length smaller than the new
value. But there may be risk of getting out of memory exception if we
keep the value large and there are too many active blob and clob.I think
we need not make the buffer size too large, as right now after making
the stream buffered we are able to get performance comparable to the
older implementation.



Øystein Grøvlen added a comment - 23/May/07 11:52 AM
> Once I changed the blob/clob to hold only 4k in memory and if it exceeds
> use a file it started taking very long because of the file i/o over
> heads. If we change the max buffer size (higher than 4k) we will be
> getting better performance for clob/blob of length smaller than the new
> value. But there may be risk of getting out of memory exception if we
> keep the value large and there are too many active blob and clob.I think
> we need not make the buffer size too large, as right now after making
> the stream buffered we are able to get performance comparable to the
> older implementation.

I think it is important that temporary storage is only used when the LOB is updated. Hence, when reading a LOB, one should make sure that the LOB is not written to temporary storage, regardless of size.