Derby / DERBY-5698

Document performance issue with 2-arg versions of setXXXStream methods for LOBs

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 10.8.2.2
    • Fix Version/s: 10.9.1.0
    • Component/s: Documentation
    • Labels:
      None

      Description

      The PreparedStatement.setAsciiStream and other methods have a 3-arg form that includes the length of the stream and a 2-arg form that does not. If the 2-arg form is used, Derby has to calculate the length every time the method is called. With LOBs, especially large ones, this can cause a major performance impact, especially if the method is called repeatedly. This should be documented where appropriate.

      Kristian, please feel free to correct or amplify anything I've said here.

      1. DERBY-5698.diff
        3 kB
        Kim Haase
      2. DERBY-5698.stat
        0.1 kB
        Kim Haase
      3. DERBY-5698.zip
        6 kB
        Kim Haase
      4. DERBY-5698-2.diff
        4 kB
        Kim Haase
      5. DERBY-5698-2.zip
        6 kB
        Kim Haase
      6. DERBY-5698-3.diff
        4 kB
        Kim Haase
      7. DERBY-5698-3.zip
        6 kB
        Kim Haase

        Activity

        Kim Haase added a comment -

        Attaching DERBY-5698.diff, DERBY-5698.stat, and DERBY-5698.zip, with changes to the following:

        M src/ref/rrefjavsqlprst.dita
        M src/ref/rrefjdbc4_0summary.dita

        I added the following sentence to each topic:

        "Omitting the length argument when the stream object is a LOB can impair performance, especially if the method is called repeatedly in an application or if the LOB is a large one."

        Please let me know if any change is needed to this information.

        Dag H. Wanvik added a comment -

        Thanks, Kim. Looks good to me, but I am not familiar with the original performance issue here.

        Kim Haase added a comment -

        Thanks, Dag! Yes, this is in Kristian's area of expertise so I'm hoping he'll take a look too.

        Kristian Waagan added a comment -

        Derby uses a header to store length information for LOBs. If this header can't be filled in when the LOB is inserted, which will be the case if the LOB is inserted without specifying the length up front and the LOB is larger than a single page (typically 32 KB, but may be smaller too), Derby has no mechanism to update the header without rewriting the whole LOB at a later time.

        The insert performance will be unaffected ([1]). You'll see impaired performance if you start asking for the length of these LOBs, as Derby has to fetch and decode the whole value to find the length.
        If you're just reading data, say, with a stream, performance is unaffected.

        Assuming you inserted 101 LOBs without specifying their lengths, 100 of them being 2 GB big and one only 10 bytes, you'd be in a world of hurt with the following query:
        select min(length(myLOBs)) from mytable

        I think this can be fixed, but it will require some non-trivial work in the store.

        [1] This holds for recent versions; older versions had some edge-case issues when using the client driver.
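
To illustrate the point above, here is a minimal Java sketch (not part of the original comment) of choosing between the two overloads at bind time. The class name `LobInsertSketch`, the method `bindStream`, and the convention that a negative `knownLength` means "unknown" are hypothetical; the only real API used is the standard JDBC `PreparedStatement.setBinaryStream`, in its 3-arg (with length) and 2-arg (without length) forms.

```java
import java.io.InputStream;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical helper, for illustration only.
final class LobInsertSketch {

    /**
     * Binds a binary stream parameter, preferring the length-aware overload.
     * A negative knownLength is this sketch's own convention for "length unknown".
     */
    static void bindStream(PreparedStatement ps, int index,
                           InputStream in, long knownLength) throws SQLException {
        if (knownLength >= 0) {
            // 3-arg form: the length is supplied up front, so it can be
            // stored with the LOB when the value is inserted.
            ps.setBinaryStream(index, in, knownLength);
        } else {
            // 2-arg form: no length is supplied. Per the comment above, the
            // insert itself is not slower, but later length lookups on a
            // multi-page LOB may require reading the whole value.
            ps.setBinaryStream(index, in);
        }
    }
}
```

When the data comes from a file or a byte array, the length is usually available for free (for example from `File.length()`), so passing it costs nothing at the call site.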

        Kim Haase added a comment -

        Thanks, Kristian! So would it be better to turn the sentence into a note that says something like the following?

        "If you omit the length argument when the stream object is a LOB greater than a single page in size, performance will be impaired if you later retrieve the length of the LOB. However, if you are simply inserting or reading data, performance is unaffected. See the derby.storage.pageSize property for information about setting the page size."

        Further tweaks welcome ...

        Kim Haase added a comment -

        I'm attaching a second patch with the changes I described, just so you can check them: DERBY-5698-2.diff and DERBY-5698-2.zip.

        Kristian Waagan added a comment -

        The latest patch covers the issue nicely, +1.

        I'm a little uncertain about the reference to derby.storage.pageSize. We don't want to give users the impression that the weakness can be mitigated by upping the page size, and, unless overridden, Derby will do that automatically anyway. The max is 32K, which is smaller than most LOBs.

        Kim Haase added a comment -

        Thank you, Kristian! I am attaching a third patch with the sentence about the pageSize property removed, and I'll commit it if/when I resolve my SSL problems.

        Kim Haase added a comment -

        Committed patch DERBY-5698-3.diff to documentation trunk at revision 1338837.

        Kim Haase added a comment -

        Changes have appeared in Latest Alpha Manuals.


          People

          • Assignee:
            Kim Haase
            Reporter:
            Kim Haase
          • Votes:
            0
            Watchers:
            2
