Derby
  1. Derby
  2. DERBY-4245

Sorting a table containing a CLOB fails after upgrade to 10.5

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 10.5.1.1
    • Fix Version/s: 10.5.2.0, 10.6.1.0
    • Component/s: Store
    • Labels:
      None
    • Urgency:
      Blocker
    • Issue & fix info:
      Repro attached
    • Bug behavior facts:
      Regression

      Description

      When sorting a table containing a CLOB column in a database that has been soft-upgraded from 10.4.2.0 to 10.5.1.1, the query fails with a NullPointerException.

      Reported on derby-user: http://mail-archives.apache.org/mod_mbox/db-derby-user/200905.mbox/%3ckk5ivy.tcwvy@email.bg%3e

      1. derby-4245-4a-adjust_sort_buffer_test.diff
        2 kB
        Kristian Waagan
      2. derby-4245-1c-test.diff
        11 kB
        Kristian Waagan
      3. derby-4245-3a-unread_bytes_fix.diff
        5 kB
        Kristian Waagan
      4. derby-4245-2a-sqlclob_fix.diff
        3 kB
        Kristian Waagan
      5. derby-4245-1b-test.diff
        11 kB
        Kristian Waagan
      6. derby-4245-1a-test.diff
        12 kB
        Kristian Waagan
      7. derby-4245-0a-preview.diff
        4 kB
        Kristian Waagan
      8. MixedVersionClob.java
        0.7 kB
        Knut Anders Hatlen

        Issue Links

          Activity

          Knut Anders Hatlen created issue -
          Hide
          Knut Anders Hatlen added a comment -

          To reproduce:

          1) Run the attached class with Derby 10.4.2.0 to create the database

          2) Open the database in ij with Derby 10.5.1.1 and issue a select with an ORDER BY:

          ij version 10.5
          ij> connect 'jdbc:derby:db';
          ij> select c from t order by length(c);
          ERROR XJ001: Java exception: ': java.lang.NullPointerException'.
          ij>

          Show
          Knut Anders Hatlen added a comment - To reproduce: 1) Run the attached class with Derby 10.4.2.0 to create the database 2) Open the database in ij with Derby 10.5.1.1 and issue a select with an ORDER BY: ij version 10.5 ij> connect 'jdbc:derby:db'; ij> select c from t order by length(c); ERROR XJ001: Java exception: ': java.lang.NullPointerException'. ij>
          Knut Anders Hatlen made changes -
          Field Original Value New Value
          Attachment MixedVersionClob.java [ 12408910 ]
          Kristian Waagan made changes -
          Assignee Kristian Waagan [ kristwaa ]
          Hide
          Kristian Waagan added a comment -

          Patch 0a is a preview patch.
          I found two problems when investigating the bug;

          • SQLClob incorrectly assumes that the instance variable SQLChar.stream always reference the passed in stream to readExternal.
            This is just wrong, and is what causes the NPE.
          • LimitInputStream does not support mark/reset.

          In the case of LimitInputStream, the underlying BufferedInputStream supports mark/reset. In this cause we read 5 bytes, resets and skips two bytes. This causes the limit stream to believe it can only deliver 59998 bytes, whereas the correct number is 60003 (full length minus header length). The five bytes involved in the mark/reset are "lost".

          I'm not yet sure how to handle mark/reset with the limit reader.
          I plan to address the problems in SQLClob under this issue, and create a sub-issue for the limit stream.

          The repro passes with the preview patch, and because mark/reset isn't used much it may also pass the regression tests (the fix for LimitInputStream is broken).
          I'm starting the regression tests now, will report back with the results.

          Show
          Kristian Waagan added a comment - Patch 0a is a preview patch. I found two problems when investigating the bug; SQLClob incorrectly assumes that the instance variable SQLChar.stream always reference the passed in stream to readExternal. This is just wrong, and is what causes the NPE. LimitInputStream does not support mark/reset. In the case of LimitInputStream, the underlying BufferedInputStream supports mark/reset. In this cause we read 5 bytes, resets and skips two bytes. This causes the limit stream to believe it can only deliver 59998 bytes, whereas the correct number is 60003 (full length minus header length). The five bytes involved in the mark/reset are "lost". I'm not yet sure how to handle mark/reset with the limit reader. I plan to address the problems in SQLClob under this issue, and create a sub-issue for the limit stream. The repro passes with the preview patch, and because mark/reset isn't used much it may also pass the regression tests (the fix for LimitInputStream is broken). I'm starting the regression tests now, will report back with the results.
          Kristian Waagan made changes -
          Attachment derby-4245-0a-preview.diff [ 12408935 ]
          Hide
          Kristian Waagan added a comment -

          The regression tests passed, but that does not mean the patch is ready for commit

          Show
          Kristian Waagan added a comment - The regression tests passed, but that does not mean the patch is ready for commit
          Hide
          Knut Anders Hatlen added a comment -

          It doesn't look like the upgrade step is necessary in order to reproduce this issue. I also see the NPE when I create the database with trunk. The preview patch appears to fix it.

          Show
          Knut Anders Hatlen added a comment - It doesn't look like the upgrade step is necessary in order to reproduce this issue. I also see the NPE when I create the database with trunk. The preview patch appears to fix it.
          Kristian Waagan made changes -
          Fix Version/s 10.5.1.2 [ 12313870 ]
          Kristian Waagan made changes -
          Status Open [ 1 ] In Progress [ 3 ]
          Hide
          Kristian Waagan added a comment -

          Patch 1a contains a test for this issue.
          The test has not been enabled. On my system it takes around 130 seconds, where around half of the time is spent inserting test data. I'll see if I can reduce the number of rows inserted further.

          Patch ready for review.

          Show
          Kristian Waagan added a comment - Patch 1a contains a test for this issue. The test has not been enabled. On my system it takes around 130 seconds, where around half of the time is spent inserting test data. I'll see if I can reduce the number of rows inserted further. Patch ready for review.
          Kristian Waagan made changes -
          Attachment derby-4245-1a-test.diff [ 12410224 ]
          Kristian Waagan made changes -
          Derby Info [Regression] [Patch Available, Regression]
          Hide
          Knut Anders Hatlen added a comment -

          Would it make sense to include the seed in the test name by overriding getName() and make it return super.getName() + SEED? Then we don't need to intercept all exceptions manually in order to report the seed.

          Show
          Knut Anders Hatlen added a comment - Would it make sense to include the seed in the test name by overriding getName() and make it return super.getName() + SEED? Then we don't need to intercept all exceptions manually in order to report the seed.
          Hide
          Kristian Waagan added a comment -

          It makes sense, as it makes the code cleaner. Is patch 1b along the lines you were thinking of?

          I also attached patch 2a, which fixes the problems in SQLClob. When applying it, the bug should no longer reproduce with a clean database. The bug will still occur with upgraded databases.
          Patch 2a passes the regression tests.

          Patches ready for review.

          Show
          Kristian Waagan added a comment - It makes sense, as it makes the code cleaner. Is patch 1b along the lines you were thinking of? I also attached patch 2a, which fixes the problems in SQLClob. When applying it, the bug should no longer reproduce with a clean database. The bug will still occur with upgraded databases. Patch 2a passes the regression tests. Patches ready for review.
          Kristian Waagan made changes -
          Attachment derby-4245-1b-test.diff [ 12410320 ]
          Attachment derby-4245-2a-sqlclob_fix.diff [ 12410321 ]
          Hide
          Knut Anders Hatlen added a comment -

          Thanks. 1b looks good and I've verified that it fails without the 2a patch and passes with it. You may also want to change the calls to fail() in fetchIterateGetLengthBlob/fetchIterateGetLengthBlob to assertEquals() now that the seed is part of the name.

          The changes in 2a also look fine to me. +1 to commit.

          Show
          Knut Anders Hatlen added a comment - Thanks. 1b looks good and I've verified that it fails without the 2a patch and passes with it. You may also want to change the calls to fail() in fetchIterateGetLengthBlob/fetchIterateGetLengthBlob to assertEquals() now that the seed is part of the name. The changes in 2a also look fine to me. +1 to commit.
          Hide
          Kristian Waagan added a comment -

          Committed patch 2a to trunk with revision 784701.

          Show
          Kristian Waagan added a comment - Committed patch 2a to trunk with revision 784701.
          Kristian Waagan made changes -
          Fix Version/s 10.6.0.0 [ 12313727 ]
          Derby Info [Regression, Patch Available] [Regression]
          Hide
          Kristian Waagan added a comment -

          Patch 3a is the first revision of a fix for the issue seen for upgraded databases.
          The regression tests passed.

          Before I commit this I need to see if I understand exactly why the failure occurred. Specifically, I have to verify my own claim about the BufferedInputStream. It may be that there shouldn't be a buffered stream here. This won't affect the problem with LimitInputStream and mark/reset, but it may be an issue worth fixing anyway. I don't think there is a reason to buffer streams serving bytes from the pages in the page cache.

          Patch ready for initial review.

          Show
          Kristian Waagan added a comment - Patch 3a is the first revision of a fix for the issue seen for upgraded databases. The regression tests passed. Before I commit this I need to see if I understand exactly why the failure occurred. Specifically, I have to verify my own claim about the BufferedInputStream. It may be that there shouldn't be a buffered stream here. This won't affect the problem with LimitInputStream and mark/reset, but it may be an issue worth fixing anyway. I don't think there is a reason to buffer streams serving bytes from the pages in the page cache. Patch ready for initial review.
          Kristian Waagan made changes -
          Attachment derby-4245-3a-unread_bytes_fix.diff [ 12410674 ]
          Hide
          Kristian Waagan added a comment -

          Thank you for the feedback, Knut Anders.

          Attached and committed patch 1c to trunk with revision 785104. The test is not yet enabled.

          Show
          Kristian Waagan added a comment - Thank you for the feedback, Knut Anders. Attached and committed patch 1c to trunk with revision 785104. The test is not yet enabled.
          Kristian Waagan made changes -
          Attachment derby-4245-1c-test.diff [ 12410763 ]
          Hide
          Kristian Waagan added a comment -

          Fixed two spelling errors in the test with revision 785105.

          Show
          Kristian Waagan added a comment - Fixed two spelling errors in the test with revision 785105.
          Kristian Waagan made changes -
          Derby Info [Regression] [Patch Available, Regression]
          Hide
          Kristian Waagan added a comment -

          Code inspection shows that getting a BufferedInputStream in this case is correct. It comes from StreamFileContainer.open, and the source is a file. The merge sort (MergeSort.openSortScan) uses stream containers.
          The default buffer size is 16 KB (RawStoreFactory.STREAM_FILE_BUFFER_SIZE_DEFAULT).

          Show
          Kristian Waagan added a comment - Code inspection shows that getting a BufferedInputStream in this case is correct. It comes from StreamFileContainer.open, and the source is a file. The merge sort (MergeSort.openSortScan) uses stream containers. The default buffer size is 16 KB (RawStoreFactory.STREAM_FILE_BUFFER_SIZE_DEFAULT).
          Hide
          Knut Anders Hatlen added a comment -

          > On my system it takes around 130 seconds, where around half of the time is spent inserting test data. I'll see if I can reduce the number of rows inserted further.

          I think if you lower derby.storage.sortBufferMax, you'll see the bug with less rows in the table.

          Show
          Knut Anders Hatlen added a comment - > On my system it takes around 130 seconds, where around half of the time is spent inserting test data. I'll see if I can reduce the number of rows inserted further. I think if you lower derby.storage.sortBufferMax, you'll see the bug with less rows in the table.
          Dag H. Wanvik made changes -
          Bug behavior facts [Regression]
          Hide
          Kristian Waagan added a comment -

          Thanks for the tip, Knut.

          Patch 4a lowers the sort buffer. Although the lowered sort buffer size makes it possible to run with even fewer rows, I chose to keep it as high as it is (100 rows) due to the randomness in the test and because lowering it further didn't make the test runs that much faster.

          Committed patch 4a to trunk with revision 790162.

          Show
          Kristian Waagan added a comment - Thanks for the tip, Knut. Patch 4a lowers the sort buffer. Although the lowered sort buffer size makes it possible to run with even fewer rows, I chose to keep it as high as it is (100 rows) due to the randomness in the test and because lowering it further didn't make the test runs that much faster. Committed patch 4a to trunk with revision 790162.
          Kristian Waagan made changes -
          Hide
          Dag H. Wanvik added a comment -

          Triaged for 10.5.2, "repro attached" and setting "urgent" urgency.

          Show
          Dag H. Wanvik added a comment - Triaged for 10.5.2, "repro attached" and setting "urgent" urgency.
          Dag H. Wanvik made changes -
          Urgency Urgent
          Issue & fix info [Patch Available] [Patch Available, Repro attached]
          Hide
          Kristian Waagan added a comment -

          Committed patch 3a to trunk with revision 791398.

          Is it worth making LimitInputStream.mark(int) do nothing and have reset() throw an exception to be on the safe side?

          With the possible exception of mark/reset above, I don't expect to do any more work on this issue before I port the fixes to the 10.5 branch.

          Show
          Kristian Waagan added a comment - Committed patch 3a to trunk with revision 791398. Is it worth making LimitInputStream.mark(int) do nothing and have reset() throw an exception to be on the safe side? With the possible exception of mark/reset above, I don't expect to do any more work on this issue before I port the fixes to the 10.5 branch.
          Kristian Waagan made changes -
          Issue & fix info [Repro attached, Patch Available] [Repro attached]
          Hide
          Kathey Marsden added a comment -

          Changing to Blocker urgency. It looks like port to 10.5 is all that is needed.

          Show
          Kathey Marsden added a comment - Changing to Blocker urgency. It looks like port to 10.5 is all that is needed.
          Kathey Marsden made changes -
          Urgency Urgent Blocker
          Hide
          Kathey Marsden added a comment -

          Is this ready to port to 10.5? It is currently marked as a Blocker for the release.

          Show
          Kathey Marsden added a comment - Is this ready to port to 10.5? It is currently marked as a Blocker for the release.
          Hide
          Kristian Waagan added a comment -

          The regression tests passed on my machine for 10.5. Merged to the 10.5 branch with revision 793565.

          Show
          Kristian Waagan added a comment - The regression tests passed on my machine for 10.5. Merged to the 10.5 branch with revision 793565.
          Kristian Waagan made changes -
          Status In Progress [ 3 ] Resolved [ 5 ]
          Resolution Fixed [ 1 ]
          Kathey Marsden made changes -
          Fix Version/s 10.5.2.0 [ 12314116 ]
          Fix Version/s 10.5.1.2 [ 12313870 ]
          Kristian Waagan made changes -
          Link This issue is duplicated by DERBY-4333 [ DERBY-4333 ]
          Kristian Waagan made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Gavin made changes -
          Workflow jira [ 12464250 ] Default workflow, editable Closed status [ 12799507 ]

            People

            • Assignee:
              Kristian Waagan
              Reporter:
              Knut Anders Hatlen
            • Votes:
              0 Vote for this issue
              Watchers:
              0 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development