Derby
DERBY-2349

Accessing a BLOB column twice in an INSERT trigger leads to errors in the value on-disk

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 10.3.1.4
    • Fix Version/s: 10.6.1.0
    • Component/s: SQL
    • Labels:
      None
    • Urgency:
      Normal
    • Issue & fix info:
      Repro attached
    • Bug behavior facts:
      Data corruption

      Description

      Either the BLOB is stored with an incorrect value, or the value on disk does not match its stored length and an exception is raised. The BLOB was supplied as a stream value.

      See this with the new TriggersTest. The test fixture will have a comment with this bug number showing how to reproduce the problem.
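
      A minimal sketch of the scenario, assuming an embedded Derby connection. The table and trigger names below are illustrative only; the exact shape that trips the bug is in TriggersTest and the attached repro, not here:

          import java.io.ByteArrayInputStream;
          import java.sql.Connection;
          import java.sql.DriverManager;
          import java.sql.PreparedStatement;
          import java.sql.Statement;

          public class Derby2349Sketch {
              public static void main(String[] args) throws Exception {
                  // Load the embedded driver explicitly (needed on pre-JDBC-4 JVMs).
                  Class.forName("org.apache.derby.jdbc.EmbeddedDriver");
                  Connection c = DriverManager.getConnection("jdbc:derby:d2349db;create=true");
                  Statement s = c.createStatement();

                  // Hypothetical schema: the trigger action reads the new BLOB value twice.
                  s.executeUpdate("CREATE TABLE src (id INT, b BLOB(64 K))");
                  s.executeUpdate("CREATE TABLE log (b1 BLOB(64 K), b2 BLOB(64 K))");
                  s.executeUpdate(
                          "CREATE TRIGGER copy_blob AFTER INSERT ON src " +
                          "REFERENCING NEW TABLE AS inserted FOR EACH STATEMENT " +
                          "INSERT INTO log SELECT b, b FROM inserted");

                  // Supply the BLOB as a stream, as described above.
                  byte[] data = new byte[42879];
                  PreparedStatement ps = c.prepareStatement("INSERT INTO src VALUES (?, ?)");
                  ps.setInt(1, 1);
                  ps.setBinaryStream(2, new ByteArrayInputStream(data), data.length);
                  ps.executeUpdate();

                  // On affected versions, reading log.b2 back is where the wrong value
                  // or the XSDA7 error would be expected to show up.
                  c.close();
              }
          }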

          Activity

          Daniel John Debrunner added a comment -

          Errors seen

          ERROR XSDA7: Restore of a serializable or SQLData object of class , attempted to read more data than was originally stored

          junit.framework.AssertionFailedError: Blobs have different lengths expected:<42879> but was:<14>

          Mamta A. Satoor added a comment -

          I have extracted a stand-alone repro program for this JIRA entry; it is attached here as DERBY_2349_Repro.java. I think it will make it easier to debug the problem outside of the JUnit framework.

          Kathey Marsden added a comment -

          DERBY-3645 may be the root cause of this issue.

          Rick Hillegas added a comment -

          Triaging for 10.5.2: assigned normal urgency, flagged as data corruption, noted that a repro is available.

          Kristian Waagan added a comment -

          Investigation shows that the data is actually written incorrectly to disk. Only the last piece (what's written on the last overflow page) is written for the second BLOB column.
          The same SQLBlob object is used as the source for both columns on input. Store updates the stream for this DVD (DataValueDescriptor) to be a RememberBytesInputStream. During the insertion of the first column, this stream detects that the underlying stream (RawToBinaryInputStream) has been exhausted, and closes itself.
          When the stream is asked to fill itself for the insertion of the second column, it only has the last piece of the previous column available.
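
          In plain java.io terms (a generic illustration, not Derby's store classes), the effect is simply that a one-shot stream can be drained only once, so a second consumer is left with whatever happens to remain:

              import java.io.ByteArrayInputStream;
              import java.io.IOException;
              import java.io.InputStream;

              public class ExhaustedStreamDemo {

                  // Drains the stream and returns the number of bytes read.
                  static int drain(InputStream in) throws IOException {
                      byte[] buf = new byte[4096];
                      int total = 0;
                      for (int n; (n = in.read(buf)) != -1; ) {
                          total += n;
                      }
                      return total;
                  }

                  public static void main(String[] args) throws IOException {
                      InputStream in = new ByteArrayInputStream(new byte[42879]);
                      System.out.println("first read:  " + drain(in));  // 42879 bytes
                      System.out.println("second read: " + drain(in));  // 0 bytes - the stream is spent
                  }
              }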

          I note that for the row trigger, the values are materialized and we don't run into this problem. However, materializing a LOB (multiple times) is a bad idea if it is big.

          It seems we have several issues open that all have the same "root cause": reading multiple times from the same stream. The difference is the source: (a) a user stream or (b) a Derby store stream.

          (a) User stream.
          In this case we don't have a choice - we have to somehow store the user stream and then re-read the stored data. I immediately see two options, differing in performance and possibly complexity: write the value to the temporary area and use this as the source for further work, or write the value once to the store and then read from store on subsequent actions.
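
          A minimal sketch of the first option, using only java.io (the class and method names are made up for illustration, not Derby's internal APIs): spool the user stream to a temporary file once, then hand every consumer its own stream over the copy:

              import java.io.File;
              import java.io.FileInputStream;
              import java.io.FileOutputStream;
              import java.io.IOException;
              import java.io.InputStream;
              import java.io.OutputStream;

              // Spools a one-shot user stream to a temp file so it can be re-read.
              public class SpooledStream {
                  private final File spool;

                  public SpooledStream(InputStream userStream) throws IOException {
                      spool = File.createTempFile("lob-spool", ".bin");
                      spool.deleteOnExit();
                      OutputStream out = new FileOutputStream(spool);
                      try {
                          byte[] buf = new byte[8192];
                          for (int n; (n = userStream.read(buf)) != -1; ) {
                              out.write(buf, 0, n);
                          }
                      } finally {
                          out.close();
                      }
                  }

                  // Each caller gets an independent stream over the spooled copy.
                  public InputStream openStream() throws IOException {
                      return new FileInputStream(spool);
                  }
              }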

          (b) Derby stream.
          Since materializing the value isn't a good thing for large objects, there are at least two remaining basic options: use of a single stream with repositioning, or cloning of streams.
          There are challenges with both. For the former we have to make sure we reposition whenever it is required, but not excessively. For the latter, we would like to avoid cloning unless it is required.
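
          For the repositioning idea, a rough java.io analogue is mark()/reset() on a stream that supports it (again just a concept sketch; Derby's store streams would need their own repositioning logic):

              import java.io.BufferedInputStream;
              import java.io.ByteArrayInputStream;
              import java.io.IOException;
              import java.io.InputStream;

              public class RepositionDemo {
                  public static void main(String[] args) throws IOException {
                      InputStream in = new BufferedInputStream(
                              new ByteArrayInputStream(new byte[1024]));

                      in.mark(4096);              // remember the start of the value
                      while (in.read() != -1) { } // first consumer drains the stream
                      in.reset();                 // reposition before the second consumer

                      int second = 0;
                      while (in.read() != -1) {
                          second++;
                      }
                      System.out.println("second consumer read " + second + " bytes"); // 1024
                  }
              }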

          Another thing that would be nice for large objects is the ability to let columns share the same field value representation. This may introduce a lot more complex bookkeeping and requires investigation - for this reason it sounds like a version 11 feature to me.

          There are many things I don't know the details about in the store, so this comment is aimed at generating feedback!

          Kristian Waagan added a comment -

          Fixed by the combination of DERBY-4477 and DERBY-4520.
          Verified against trunk revision 980933, but it would be nice if the reporter verified the fix as well.

          I do not plan to back-port the fix(es) due to the rather significant changes involved, but it may be technically possible.

          Knut Anders Hatlen added a comment -

          [bulk update] Close all resolved issues that haven't been updated for more than one year.


            People

            • Assignee:
              Unassigned
            • Reporter:
              Daniel John Debrunner
            • Votes:
              0
            • Watchers:
              2
