Derby / DERBY-3646

Embedded returns wrong results when selecting a blob column twice and using getBinaryStream()

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 10.1.3.1, 10.2.2.0, 10.3.3.0, 10.4.1.3, 10.5.1.1
    • Fix Version/s: 10.5.3.1, 10.6.1.0
    • Component/s: JDBC
    • Labels:
      None
    • Urgency:
      Normal
    • Issue & fix info:
      High Value Fix, Repro attached
    • Bug behavior facts:
      Wrong query result

      Description

      The attached program DoubleSelect selects a blob column twice and tries to access the blob column with getBinaryStream.

      With embedded the output is:
      4 5 6 7 8 9 10 11 12 13
      14 15 16 17 18 19 20 21 22 23
      I am done

      Two things seem to be happening with embedded.
      1) Both getBinaryStream() calls are returning the same stream.
      2) The second getBinaryStream() call throws away 4 bytes.
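
The two symptoms can be modeled without a database. The following is a minimal sketch, not Derby's code: the class `SharedStreamModel` and its methods are hypothetical, the blob is assumed to contain the bytes 0, 1, 2, ..., and the 4-byte skip is hard-coded only to mirror the observation above.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class SharedStreamModel {
    // One underlying stream shared by both "columns" (models symptom 1).
    private final InputStream shared;
    private boolean handedOut = false;

    public SharedStreamModel(byte[] data) {
        this.shared = new ByteArrayInputStream(data);
    }

    // Stand-in for getBinaryStream(): every call returns the same object,
    // and every call after the first discards 4 bytes (models symptom 2).
    public InputStream getBinaryStream() throws IOException {
        if (handedOut) {
            shared.skip(4);
        }
        handedOut = true;
        return shared;
    }

    // Reads up to ten bytes and formats them as space-separated numbers.
    public static String nextTen(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 10; i++) {
            int b = in.read();
            if (b < 0) break;
            if (sb.length() > 0) sb.append(' ');
            sb.append(b);
        }
        return sb.toString();
    }

    // Fetches both streams first, then reads ten bytes from each.
    public static String[] demo() {
        byte[] data = new byte[100];
        for (int i = 0; i < data.length; i++) data[i] = (byte) i;
        SharedStreamModel row = new SharedStreamModel(data);
        try {
            InputStream s1 = row.getBinaryStream();
            InputStream s2 = row.getBinaryStream(); // same object; 4 bytes discarded
            return new String[] { nextTen(s1), nextTen(s2) };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (String line : demo()) System.out.println(line);
    }
}
```

Under these assumptions the model prints the same two lines as the embedded run, which is consistent with the shared-stream explanation but does not prove it.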

      With client the output is:
      Exception in thread "main" java.io.IOException: The object is already closed.
          at org.apache.derby.client.am.CloseFilterInputStream.read(CloseFilterInputStream.java:50)
          at DoubleSelect.printNextTen(DoubleSelect.java:53)
          at DoubleSelect.main(DoubleSelect.java:43)
      0 1 2 3 4 5 6 7 8 9
      So with client it looks like the second getBinaryStream() call closes
      the first stream but then returns the right result for the second stream.
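
The client behavior as interpreted here (a new getter call closes the previously returned stream) can be sketched in plain Java. This is an illustrative model only: `CloseOnNextGetterModel` and `ClosableStream` are hypothetical names and are not the client driver's classes, and the data is assumed to be the bytes 0, 1, 2, ....

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class CloseOnNextGetterModel {
    // Hypothetical stream that rejects single-byte reads once closed.
    // (Only read() is guarded; bulk reads are left out to keep the sketch short.)
    static final class ClosableStream extends FilterInputStream {
        private boolean closed = false;

        ClosableStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            if (closed) throw new IOException("The object is already closed.");
            return super.read();
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    private final byte[] data;
    private ClosableStream current; // stream returned by the previous getter call

    public CloseOnNextGetterModel(byte[] data) {
        this.data = data;
    }

    // Stand-in for getBinaryStream(): closes the previous stream, then
    // returns a fresh one positioned at the start of the data.
    public InputStream getBinaryStream() throws IOException {
        if (current != null) current.close();
        current = new ClosableStream(new ByteArrayInputStream(data));
        return current;
    }

    // Fetches both streams, then tries to read the first byte of each.
    public static String[] demo() {
        byte[] data = new byte[100];
        for (int i = 0; i < data.length; i++) data[i] = (byte) i;
        CloseOnNextGetterModel row = new CloseOnNextGetterModel(data);
        try {
            InputStream s1 = row.getBinaryStream();
            InputStream s2 = row.getBinaryStream(); // closes s1
            String first;
            try {
                first = String.valueOf(s1.read());
            } catch (IOException e) {
                first = "IOException: " + e.getMessage();
            }
            String second = String.valueOf(s2.read());
            return new String[] { first, second };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (String line : demo()) System.out.println(line);
    }
}
```

In this model the first stream fails on read while the second starts at byte 0, matching the interpretation above.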

      Perhaps embedded should behave the same as client, or perhaps the query should just work. Regardless, embedded should not return wrong results.

      Attachments

      1. derby-3646_diff_10_5.txt (16 kB, Kathey Marsden)
      2. DoubleSelect.java (2 kB, Kathey Marsden)

          Activity

          Kathey Marsden added a comment -

          This issue may be related to DERBY-3645 which is another case where we select a blob column twice.

          Kathey Marsden added a comment -

          With 10.1, the client returns the right result for both columns:
          java DoubleSelect client
          Connection number: 1.
          0 1 2 3 4 5 6 7 8 9
          0 1 2 3 4 5 6 7 8 9
          I am done

          From 10.2 onward, the client throws the exception.

          The embedded behavior (wrong result) is the same for both 10.1 and 10.2.

          Kristian Waagan added a comment -

          If I move the second getBinaryStream call down to after the printNextTen call, the client (10.4) prints the values 0 - 9 for both columns.
          I guess this is the expected behavior?
          Based on how the client closes the previous stream when getting a new one, I would say this behavior seems to be intended.

          The embedded driver (10.4) still prints the wrong values for the second column.

          Kristian Waagan added a comment -

          After a little investigation, I found out it is easy enough to make the embedded driver behave as the client driver.
          I'm not sure what is the best exact implementation of the fix, which is currently a three line hack.
          Before working further with that, I'd like to see the community agree on what kind of behavior we want to allow.

          From what I have gathered, the spec doesn't really give anything concrete, but talks about what is recommended for maximum portability.

          Also, my first assessment is that it should be possible to allow almost anything, but that this can have significant performance implications (might include overhead for the simplest use cases as well).
          For reference, I'm listing some options, all with regard to the statement "SELECT blobcolumn as b1, blobcolumn as b2 from blobtable".
          The code examples are pseudo-JDBC/Java.

          a) Disallow selecting the same [Blob] column twice.
          Statement must be disallowed.

          b) Allow getting and processing a single stream in the order of the columns.
          streamB1 = rs.getBinaryStream(b1)
          processStream(streamB1)
          streamB2 = rs.getBinaryStream(b2) // streamB1 is automatically closed here
          processStream(streamB2)

          c) Allow getting and processing a single stream in any column order.
          streamB2 = rs.getBinaryStream(b2)
          processStream(streamB2)
          streamB1 = rs.getBinaryStream(b1) // streamB2 is automatically closed here
          processStream(streamB1)

          d) Allow getting and processing multiple streams in any order, but only once per column.
          streamB2 = rs.getBinaryStream(b2)
          streamB1 = rs.getBinaryStream(b1) // streamB2 is kept open
          processStream(streamB2)
          processStream(streamB1)

          e) Allow getting and processing multiple streams in any order, multiple times per column.
          streamB2 = rs.getBinaryStream(b2)
          streamB1 = rs.getBinaryStream(b1) // streamB2 is kept open
          processStream(streamB2)
          streamB2second = rs.getBinaryStream(b2) // other streams are kept open
          processStream(streamB1)
          processStream(streamB2second)

          For options d and e, I expect the state of each stream (position, open/closed) to be independent of each other. Also, a stream returned from getBinaryStream should be positioned at the beginning of the data.
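
The independence expected for options d and e can be illustrated with a small in-memory model. This is only a sketch under the assumption that the whole value is available as a byte array; `IndependentStreamsModel` is a hypothetical name, and a real LOB implementation would not necessarily be able to hand out independent streams this cheaply.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class IndependentStreamsModel {
    private final byte[] data;

    public IndependentStreamsModel(byte[] data) {
        this.data = data;
    }

    // Stand-in for getBinaryStream(): each call returns a new stream with
    // its own position, starting at the beginning of the data.
    public InputStream getBinaryStream() {
        return new ByteArrayInputStream(data);
    }

    // Opens three streams over the same value and shows that advancing one
    // does not move the others.
    public static int[] demo() {
        IndependentStreamsModel row =
                new IndependentStreamsModel(new byte[] {0, 1, 2, 3, 4, 5, 6, 7, 8, 9});
        try {
            InputStream b2 = row.getBinaryStream();
            InputStream b1 = row.getBinaryStream(); // b2 is kept open
            b2.read();
            b2.read();                              // b2 is now at position 2
            InputStream b2Second = row.getBinaryStream();
            return new int[] { b2.read(), b1.read(), b2Second.read() };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (int v : demo()) System.out.println(v);
    }
}
```

Here `b2` continues from where it left off, while `b1` and the second stream on the same column both start at the first byte.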

          There are plenty of things I haven't considered, so please comment on this!

          Kathey Marsden added a comment -

          In considering options, you may want to also look at DERBY-3645, which would be handled by option a, but perhaps that is too restrictive.

          Rick Hillegas added a comment -

          My $0.02:

          (a) is non-ANSI/ISO behavior.

          (b) and (c) are quirky behaviors which will puzzle users.

          (d) is limited but probably easy to explain to users.

          (e) is fully functional and should be the ultimate goal.

          Thanks,
          -Rick

          Knut Anders Hatlen added a comment -

          Triaged for 10.5.2.

          Kristian Waagan added a comment -

          There are a few extra pieces of information for this issue:

          • Dag discovered a problem with the repro and also a bug in Derby, see his comments on DERBY-447 and the bug reported as DERBY-4521.
          • With regard to the list above, Derby now allows items a, b, and c. I think options d and e are disallowed by JDBC, based on the following statement from the getBinaryStream JavaDoc:
            "Note: All the data in the returned stream must be read prior to getting the value of any other column. The next call to a getter method implicitly closes the stream. Also, a stream may return 0 when the method InputStream.available is called whether there is data available or not."

          I think we could implement d if we wanted to, and maybe even e - but the latter would probably come with a performance hit. Since JDBC seems to disallow these behaviors, disallowing them in Derby seems like a good approach to me.

          • if DERBY-3844 patch 1a is committed, you are only allowed to call get[BC]lob once on a column per row. In this case the LOB object isn't closed / freed on the next getX-call.
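
The JavaDoc note quoted above suggests an access pattern that stays within what JDBC promises: read each stream to the end before calling the next getter. A minimal sketch follows, assuming the cursor is already positioned on a row, that columns 1 and 2 are binary, and that neither is SQL NULL; the class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.sql.ResultSet;
import java.sql.SQLException;

public class DrainBeforeNextGetter {
    // Reads a stream to the end and returns everything it produced.
    public static byte[] drain(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }

    // Consumes column 1 completely before touching column 2, so no stream
    // is still in use when the next getter implicitly closes it.
    public static byte[][] readBothColumns(ResultSet rs)
            throws SQLException, IOException {
        byte[] first;
        try (InputStream in = rs.getBinaryStream(1)) {
            first = drain(in);
        }
        byte[] second;
        try (InputStream in = rs.getBinaryStream(2)) {
            second = drain(in);
        }
        return new byte[][] { first, second };
    }
}
```

Buffering whole values in memory is a simplification for illustration; for large blobs the caller would more likely process each stream incrementally, still finishing one column before starting the next.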

          Fixed by the combination of DERBY-4477 and DERBY-4520.
          Verified against trunk revision 980933, but it would be nice if the reporter verified the fix as well.

          I do not plan to back-port the fix(es) due to the rather significant changes involved, but it may be technically possible.

          Lily Wei added a comment -

          Reopening to backport to 10.5.

          Kathey Marsden added a comment -

          assign to kmarsden to backport to 10.5

          Kathey Marsden added a comment -

          Here is the 10.5 patch for this issue which required some manual merge of the test.
          BLOBTest passes individually but I don't actually see it as part of Suites.All. I am not sure how it is normally run. I will commit Monday.


            People

            • Assignee:
              Dag H. Wanvik
              Reporter:
              Kathey Marsden