Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 10.3.1.4, 10.5.3.0, 10.6.1.0
    • Fix Version/s: 10.3.3.1, 10.4.2.1, 10.5.3.1, 10.6.1.0
    • Component/s: SQL
    • Labels:
      None
    • Issue & fix info:
      Repro attached
    • Bug behavior facts:
      Regression

      Description

      On Derby 10.3.1.4 and later, I see that evaluating the statements below in ij apparently makes Derby go into an infinite loop. The select statement ran for two hours until I aborted it. I do not see this problem on Derby 10.2.2.0 or earlier.

      ij> create table t(x int primary key);
      0 rows inserted/updated/deleted
      ij> prepare ps as 'select * from t where x=? or x=?';
      ij> execute ps using 'values (cast(null as int), 0)';
      IJ WARNING: Autocommit may close using result set
      X
      -----------

      1. derby-4376-1a.diff
        16 kB
        Knut Anders Hatlen
      2. derby-4376-1a.stat
        0.3 kB
        Knut Anders Hatlen

        Issue Links

          Activity

          Hide
          Knut Anders Hatlen added a comment -

          The query uses a MultiProbeTableScanResultSet, which was introduced in 10.4.1.3 as part of DERBY-47.

          Show
          Knut Anders Hatlen added a comment - The query uses a MultiProbeTableScanResultSet, which was introduced in 10.4.1.3 as part of DERBY-47 .
          Hide
          Knut Anders Hatlen added a comment -

          The next probe value in an IN list (the query in the repro is rewritten to and IN query internally) is fetched by MPTSRS.getNextProbeValue() which is called called by MPTSRS.reopenScanController(). In MPTSRS's parent class (TableScanResultSet) we find this in reopenCore():

          // Check whether there are any comparisons with unordered nulls
          // on either the start or stop position. If there are, we can
          // (and must) skip the scan, because no rows can qualify
          if (skipScan(startPosition, stopPosition))

          { scanControllerOpened = false; }

          else

          { if (scanController == null) openScanController((TransactionController)null); else reopenScanController(); }

          So if startPosition/stopPosition contains a NULL value, reopenScanController() is not called. This means that a new probe value is not fetched, and at the next iteration we'll still be at the same position in the IN list, so we'll just continue trying to reopen the scan with start/stop==NULL again and again.

          Show
          Knut Anders Hatlen added a comment - The next probe value in an IN list (the query in the repro is rewritten to and IN query internally) is fetched by MPTSRS.getNextProbeValue() which is called called by MPTSRS.reopenScanController(). In MPTSRS's parent class (TableScanResultSet) we find this in reopenCore(): // Check whether there are any comparisons with unordered nulls // on either the start or stop position. If there are, we can // (and must) skip the scan, because no rows can qualify if (skipScan(startPosition, stopPosition)) { scanControllerOpened = false; } else { if (scanController == null) openScanController((TransactionController)null); else reopenScanController(); } So if startPosition/stopPosition contains a NULL value, reopenScanController() is not called. This means that a new probe value is not fetched, and at the next iteration we'll still be at the same position in the IN list, so we'll just continue trying to reopen the scan with start/stop==NULL again and again.
          Hide
          Knut Anders Hatlen added a comment -

          Also, TableScanResultSet.reopenCore() will always reset the start key and the stop key to the first value in the IN list right before we check if the scan can be skipped. So when the first element in the IN list is NULL, the start/stop key will always be NULL when skipScan() is called, and skipScan() will always return true.

          For MultiProbeTableScanResultSet I think we need to somehow set the start/stop key to the actual probe value before skipScan() is called.

          Show
          Knut Anders Hatlen added a comment - Also, TableScanResultSet.reopenCore() will always reset the start key and the stop key to the first value in the IN list right before we check if the scan can be skipped. So when the first element in the IN list is NULL, the start/stop key will always be NULL when skipScan() is called, and skipScan() will always return true. For MultiProbeTableScanResultSet I think we need to somehow set the start/stop key to the actual probe value before skipScan() is called.
          Hide
          Knut Anders Hatlen added a comment -

          Forgot to mention that initializing start/stop key to the first value in the IN list is documented in PredicateList (see the comment below). Anyway, since it's just a placeholder, it is not correct to use that value as the argument to skipScan().

          /* If we have a "useful" IN list probe predicate we will generate a

          • start/stop key for optTable of the form "col = <val>", where <val>
          • is the first value in the IN-list. Then during normal index multi-
          • probing (esp. as implemented by exec/MultiProbeTableScanResultSet)
          • we will use that start/stop key as a "placeholder" into which we'll
          • plug the values from the IN-list one at a time.
          Show
          Knut Anders Hatlen added a comment - Forgot to mention that initializing start/stop key to the first value in the IN list is documented in PredicateList (see the comment below). Anyway, since it's just a placeholder, it is not correct to use that value as the argument to skipScan(). /* If we have a "useful" IN list probe predicate we will generate a start/stop key for optTable of the form "col = <val>", where <val> is the first value in the IN-list. Then during normal index multi- probing (esp. as implemented by exec/MultiProbeTableScanResultSet) we will use that start/stop key as a "placeholder" into which we'll plug the values from the IN-list one at a time.
          Hide
          Knut Anders Hatlen added a comment -

          Here's a patch with a fix for this issue, and with a test case.

          The patch moves all the multi-probe logic out of TableScanResultSet and into MultiProbeTableScanResultSet, and instead created methods that MPTSRS could override where different logic was needed. In short:

          1) Initialization of the start/stop keys has been factored out of TSRS.openCore()/TSRS.reopenCore() into a helper method initStartAndStopKey().

          2) MPTSRS overrides initStartAndStopKey() and updates the keys with the actual probe value. This ensures that the keys now have the correct values when TSRS.reopenCore() calls skipScan().

          3) Methods in TSRS that take a probeValue argument have been removed, since MPTSRS.initStartAndStopKey() now does all the work with retrieving the probe value and updating the keys.

          4) Since the next probe value is fetched earlier now MPTSRS.reopenScanController() can no longer use the return value from getNextProbeValue() to decide whether or not it should be a no-op (it should be a no-op the next probe value was null, which means that the probe list has been exhausted). To ensure that we don't open scans in those cases, set a flag in initStartAndStopKey() to indicate whether or not the probe list was exhausted, and override skipScan() to check that flag and return true if no new probe value was found.

          (As a side note to 4, this should only happen when the last element in the probe list is equal to the previous element. I cannot see that this special case is tested by InListMultiProbeTest. Looking at Ole's weekly test coverage report, it looks like the code that handles duplicate values in the probe list is not exercised at all by our current tests. I will add some test cases to exercise those code paths, but in a separate patch/issue since it's not directly related to this bug.)

          I've started a full run of the regression test suites.

          Show
          Knut Anders Hatlen added a comment - Here's a patch with a fix for this issue, and with a test case. The patch moves all the multi-probe logic out of TableScanResultSet and into MultiProbeTableScanResultSet, and instead created methods that MPTSRS could override where different logic was needed. In short: 1) Initialization of the start/stop keys has been factored out of TSRS.openCore()/TSRS.reopenCore() into a helper method initStartAndStopKey(). 2) MPTSRS overrides initStartAndStopKey() and updates the keys with the actual probe value. This ensures that the keys now have the correct values when TSRS.reopenCore() calls skipScan(). 3) Methods in TSRS that take a probeValue argument have been removed, since MPTSRS.initStartAndStopKey() now does all the work with retrieving the probe value and updating the keys. 4) Since the next probe value is fetched earlier now MPTSRS.reopenScanController() can no longer use the return value from getNextProbeValue() to decide whether or not it should be a no-op (it should be a no-op the next probe value was null, which means that the probe list has been exhausted). To ensure that we don't open scans in those cases, set a flag in initStartAndStopKey() to indicate whether or not the probe list was exhausted, and override skipScan() to check that flag and return true if no new probe value was found. (As a side note to 4, this should only happen when the last element in the probe list is equal to the previous element. I cannot see that this special case is tested by InListMultiProbeTest. Looking at Ole's weekly test coverage report, it looks like the code that handles duplicate values in the probe list is not exercised at all by our current tests. I will add some test cases to exercise those code paths, but in a separate patch/issue since it's not directly related to this bug.) I've started a full run of the regression test suites.
          Hide
          Knut Anders Hatlen added a comment -

          Created DERBY-4378 for the missing duplicate tests.

          Show
          Knut Anders Hatlen added a comment - Created DERBY-4378 for the missing duplicate tests.
          Hide
          Bryan Pendleton added a comment -

          Ha! I misread your comment at first and it made me do a double-take.

          Then I re-read it, and realized that you meant "missing tests of duplicate values"

          To keep this comment from being entirely light-hearted, I think your overall approach
          (refactoring the logic so that MPTSRS can override it more cleanly) sounds excellent.

          Show
          Bryan Pendleton added a comment - Ha! I misread your comment at first and it made me do a double-take. Then I re-read it, and realized that you meant "missing tests of duplicate values" To keep this comment from being entirely light-hearted, I think your overall approach (refactoring the logic so that MPTSRS can override it more cleanly) sounds excellent.
          Hide
          Knut Anders Hatlen added a comment -

          Yes, that's what I meant to say. Writing clearly would make it too easy for the readers...

          All the regression tests ran cleanly, so I'm checking Patch Available.

          Show
          Knut Anders Hatlen added a comment - Yes, that's what I meant to say. Writing clearly would make it too easy for the readers... All the regression tests ran cleanly, so I'm checking Patch Available.
          Hide
          Knut Anders Hatlen added a comment -

          Committed revision 816536.

          I plan to port the fix to the 10.5 branch, so I'll leave the issue open for now.

          Show
          Knut Anders Hatlen added a comment - Committed revision 816536. I plan to port the fix to the 10.5 branch, so I'll leave the issue open for now.
          Hide
          Knut Anders Hatlen added a comment -

          Merged the fix to the 10.5 branch and committed revision 818924.

          Show
          Knut Anders Hatlen added a comment - Merged the fix to the 10.5 branch and committed revision 818924.
          Hide
          Mamta A. Satoor added a comment -

          Merged changes into 10.4 codeline with revision 821226. The only error I got was for BlobClob4BlobTest.testLockingWithLongRowBlob and it looks like DERBY-3740 although my testing was on windows xp machine using IBM16 jdk

          1) testLockingBlob(org.apache.derbyTesting.functionTests.tests.jdbcapi.BlobClob4BlobTest)junit.framework.AssertionFailedError: FAIL - should have gotten lock timeout
          at org.apache.derbyTesting.functionTests.tests.jdbcapi.BlobClob4BlobTest.testLockingBlob(BlobClob4BlobTest.java:2413)
          at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
          at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
          at org.apache.derbyTesting.junit.BaseTestCase.runBare(BaseTestCase.java:102)
          at junit.extensions.TestDecorator.basicRun(TestDecorator.java:22)
          at junit.extensions.TestSetup$1.protect(TestSetup.java:19)
          at junit.extensions.TestSetup.run(TestSetup.java:23)
          at org.apache.derbyTesting.junit.BaseTestSetup.run(BaseTestSetup.java:57)
          at junit.extensions.TestDecorator.basicRun(TestDecorator.java:22)
          at junit.extensions.TestSetup$1.protect(TestSetup.java:19)
          at junit.extensions.TestSetup.run(TestSetup.java:23)
          at org.apache.derbyTesting.junit.BaseTestSetup.run(BaseTestSetup.java:57)
          at junit.extensions.TestDecorator.basicRun(TestDecorator.java:22)
          at junit.extensions.TestSetup$1.protect(TestSetup.java:19)
          at junit.extensions.TestSetup.run(TestSetup.java:23)
          at junit.extensions.TestDecorator.basicRun(TestDecorator.java:22)
          at junit.extensions.TestSetup$1.protect(TestSetup.java:19)
          at junit.extensions.TestSetup.run(TestSetup.java:23)
          at org.apache.derbyTesting.junit.BaseTestSetup.run(BaseTestSetup.java:57)
          at junit.extensions.TestDecorator.basicRun(TestDecorator.java:22)
          at junit.extensions.TestSetup$1.protect(TestSetup.java:19)
          at junit.extensions.TestSetup.run(TestSetup.java:23)
          at org.apache.derbyTesting.junit.BaseTestSetup.run(BaseTestSetup.java:57)

          Show
          Mamta A. Satoor added a comment - Merged changes into 10.4 codeline with revision 821226. The only error I got was for BlobClob4BlobTest.testLockingWithLongRowBlob and it looks like DERBY-3740 although my testing was on windows xp machine using IBM16 jdk 1) testLockingBlob(org.apache.derbyTesting.functionTests.tests.jdbcapi.BlobClob4BlobTest)junit.framework.AssertionFailedError: FAIL - should have gotten lock timeout at org.apache.derbyTesting.functionTests.tests.jdbcapi.BlobClob4BlobTest.testLockingBlob(BlobClob4BlobTest.java:2413) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37) at org.apache.derbyTesting.junit.BaseTestCase.runBare(BaseTestCase.java:102) at junit.extensions.TestDecorator.basicRun(TestDecorator.java:22) at junit.extensions.TestSetup$1.protect(TestSetup.java:19) at junit.extensions.TestSetup.run(TestSetup.java:23) at org.apache.derbyTesting.junit.BaseTestSetup.run(BaseTestSetup.java:57) at junit.extensions.TestDecorator.basicRun(TestDecorator.java:22) at junit.extensions.TestSetup$1.protect(TestSetup.java:19) at junit.extensions.TestSetup.run(TestSetup.java:23) at org.apache.derbyTesting.junit.BaseTestSetup.run(BaseTestSetup.java:57) at junit.extensions.TestDecorator.basicRun(TestDecorator.java:22) at junit.extensions.TestSetup$1.protect(TestSetup.java:19) at junit.extensions.TestSetup.run(TestSetup.java:23) at junit.extensions.TestDecorator.basicRun(TestDecorator.java:22) at junit.extensions.TestSetup$1.protect(TestSetup.java:19) at junit.extensions.TestSetup.run(TestSetup.java:23) at org.apache.derbyTesting.junit.BaseTestSetup.run(BaseTestSetup.java:57) at junit.extensions.TestDecorator.basicRun(TestDecorator.java:22) at junit.extensions.TestSetup$1.protect(TestSetup.java:19) at junit.extensions.TestSetup.run(TestSetup.java:23) at org.apache.derbyTesting.junit.BaseTestSetup.run(BaseTestSetup.java:57)
          Hide
          Mamta A. Satoor added a comment -

          Merged into 10.3 codeline with 821557.

          Show
          Mamta A. Satoor added a comment - Merged into 10.3 codeline with 821557.

            People

            • Assignee:
              Knut Anders Hatlen
              Reporter:
              Knut Anders Hatlen
            • Votes:
              0 Vote for this issue
              Watchers:
              0 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development