Derby
  1. Derby
  2. DERBY-5564

Code does different things depending if derby.locks.deadlockTrace=true is set

    Details

    • Issue & fix info:
      High Value Fix, Patch Available, Release Note Needed

      Description

      I see a problem in the code handling lock timeout exceptions. In the code in various places there are calls such as:

      // 2 kinds of errors here expected here. Either container not
      // found or could not obtain lock (LOCK_TIMEOUT or DEADLOCK).
      //
      // It is possible by the time this post commit work gets scheduled
      // that the container has been dropped and that the open container
      // call will return null - in this case just return assuming no
      // work to be done.

      if (se.getMessageId().equals(SQLState.LOCK_TIMEOUT) ||
      se.getMessageId().equals(SQLState.DEADLOCK))

      Or

      // First try to do the work in the nested transaction. Fail if we can't
      // get a lock immediately.
      if ( nestedTransaction != null )
      {
      try

      { return updateCurrentValueOnDisk( nestedTransaction, oldValue, newValue, false ); }

      catch (StandardException se)
      {
      if ( !se.getMessageId().equals( SQLState.LOCK_TIMEOUT ) )

      { throw se; }

      }

      Or

      // exception might have occured either container got dropper or lock not granted.
      // It is possible by the time this post commit work gets scheduled
      // that the container has been dropped and that the open container
      // call will return null - in this case just return assuming no
      // work to be done.

      //If this expcetion is because lock could not be obtained , work is requeued.
      if (se.getMessageId().equals(SQLState.LOCK_TIMEOUT) ||
      se.getMessageId().equals(SQLState.DEADLOCK))

      { requeue_work = true; }

      The problem that I see is that if the property "derby.locks.deadlockTrace=true" is set, then instead of a SQLState.LOCK_TIMEOUT, the code will see a SQLState.LOCK_TIMEOUT_LOG. Note that this is not being checked for in the above tests and others as well, so now the code behavior is going to change basd on whether the lock tracing is enabled or not.

      I think that 99% of the code that is testing for SQLState.LOCK_TIMEOUT should also be checking for SQLState.LOCK_TIMEOUT_LOG as well. I only see one place in DDLConstantAction where it is explicitly mentioned that SQLState.LOCK_TIMEOUT_LOG is not being looked at.

      1. releaseNote.html
        4 kB
        Kathey Marsden
      2. releaseNote.html
        4 kB
        Mike Matrigali
      3. releaseNote.html
        4 kB
        Mike Matrigali
      4. derby-5564_3.patch
        36 kB
        Mike Matrigali
      5. derby-5564_2.patch
        20 kB
        Mike Matrigali
      6. DERBY-5564.patch
        7 kB
        Brett Bergquist

        Issue Links

          Activity

          Brett Bergquist created issue -
          Hide
          Brett Bergquist added a comment -

          The attached patch adds a new "isLockStandardException" static method to StandardException that checks for either LOCK_TIMEOUT or LOCK_TIMEOUT_LOG and returns true if the exception is either one.

          The rest of the patch fixes the places in the code where only LOCK_TIMEOUT was being checked for to use this new method except for the change in EmbedXAResource.java which only has an SQLException to work with, not a StandardException.

          Show
          Brett Bergquist added a comment - The attached patch adds a new "isLockStandardException" static method to StandardException that checks for either LOCK_TIMEOUT or LOCK_TIMEOUT_LOG and returns true if the exception is either one. The rest of the patch fixes the places in the code where only LOCK_TIMEOUT was being checked for to use this new method except for the change in EmbedXAResource.java which only has an SQLException to work with, not a StandardException.
          Brett Bergquist made changes -
          Field Original Value New Value
          Attachment DERBY-5564.patch [ 12509300 ]
          Hide
          Kathey Marsden added a comment -

          It seems in general problematic that we throw exceptions with different SQLStates with derby.locks.deadlockTrace=true.

          String LOCK_TIMEOUT = "40XL1";
          String LOCK_TIMEOUT_LOG = "40XL2";

          Lock timeout is something that applications might generally check for and if they start getting 40XL2 instead 40XL2, just enabling the diagnostic property could change behavior which would make the diagnostics more frustrating than helpful.

          Show
          Kathey Marsden added a comment - It seems in general problematic that we throw exceptions with different SQLStates with derby.locks.deadlockTrace=true. String LOCK_TIMEOUT = "40XL1"; String LOCK_TIMEOUT_LOG = "40XL2"; Lock timeout is something that applications might generally check for and if they start getting 40XL2 instead 40XL2, just enabling the diagnostic property could change behavior which would make the diagnostics more frustrating than helpful.
          Hide
          Bryan Pendleton added a comment -

          +1 to both of the above comments:

          • I like the idea of a "isLockException()" method on StandardException so that
            code doesn't have to do getMessageId().equals()
          • I agree that it seems wrong that setting deadlockTrace=true causes a
            different exception to be thrown
          Show
          Bryan Pendleton added a comment - +1 to both of the above comments: I like the idea of a "isLockException()" method on StandardException so that code doesn't have to do getMessageId().equals() I agree that it seems wrong that setting deadlockTrace=true causes a different exception to be thrown
          Mike Matrigali made changes -
          Assignee Mike Matrigali [ mikem ]
          Hide
          Mike Matrigali added a comment -

          I will look at getting this fix in, but plan to change the patch such that only 40XL1 is returned in either case, and somehow the text of the error is changed based on the diagnostic level. This will likely change the error message and maybe number of arguments to the error message - I have not looked closely yet. Let me know if anyone is opposed to this approach.

          Show
          Mike Matrigali added a comment - I will look at getting this fix in, but plan to change the patch such that only 40XL1 is returned in either case, and somehow the text of the error is changed based on the diagnostic level. This will likely change the error message and maybe number of arguments to the error message - I have not looked closely yet. Let me know if anyone is opposed to this approach.
          Hide
          Brett Bergquist added a comment -

          This might break application clients that are looking for 40XL2 only but I think that would be remote. Probably if a client is setup to check for a 40XL2 it more than likely is also checking for a 40XL1 so this change would have no real affect other than leaving code in a client that will never be hit.

          I am not apposed to the change.

          Show
          Brett Bergquist added a comment - This might break application clients that are looking for 40XL2 only but I think that would be remote. Probably if a client is setup to check for a 40XL2 it more than likely is also checking for a 40XL1 so this change would have no real affect other than leaving code in a client that will never be hit. I am not apposed to the change.
          Hide
          Knut Anders Hatlen added a comment -

          Perhaps it could be as simple as changing LOCK_TIMEOUT_LOG to 40XL1.S.1 (and changing the corresponding entry in messages.xml), and checking getSQLState() instead of getMessageId() (since the former strips off the .S.1 suffix).

          Show
          Knut Anders Hatlen added a comment - Perhaps it could be as simple as changing LOCK_TIMEOUT_LOG to 40XL1.S.1 (and changing the corresponding entry in messages.xml), and checking getSQLState() instead of getMessageId() (since the former strips off the .S.1 suffix).
          Hide
          Mike Matrigali added a comment -

          Would Knut's proposed change work with existing user applications? It seems like a better solution if it works with existing applications. I believe the SQL state is part of the standard, so seems like it should work.

          Show
          Mike Matrigali added a comment - Would Knut's proposed change work with existing user applications? It seems like a better solution if it works with existing applications. I believe the SQL state is part of the standard, so seems like it should work.
          Hide
          Knut Anders Hatlen added a comment -

          Applications typically check the SQLState, not the message id. To get the message id, they'd first need to unwrap the SQLException to get the underlying StandardException, and StandardException is not part of the public API. So I don't think it would affect existing applications. Except, as Brett mentioned, if the applications check for SQLState 40XL2 without also checking for 40XL1. We should probably mention that in the release notes.

          Show
          Knut Anders Hatlen added a comment - Applications typically check the SQLState, not the message id. To get the message id, they'd first need to unwrap the SQLException to get the underlying StandardException, and StandardException is not part of the public API. So I don't think it would affect existing applications. Except, as Brett mentioned, if the applications check for SQLState 40XL2 without also checking for 40XL1. We should probably mention that in the release notes.
          Hide
          Knut Anders Hatlen added a comment -

          By the way, I think the severity suffix should be .T, not .S as suggested above, since lock timeouts are transaction severity errors.

          Show
          Knut Anders Hatlen added a comment - By the way, I think the severity suffix should be .T, not .S as suggested above, since lock timeouts are transaction severity errors.
          Hide
          Mike Matrigali added a comment -

          marking that a release note should be included noting that a different error code will be returned in the case of a timeout when extra diagnostics are enabled. proposal is to return different text but to return the same LOCK_TIMEOUT sql state in both cases.

          Show
          Mike Matrigali added a comment - marking that a release note should be included noting that a different error code will be returned in the case of a timeout when extra diagnostics are enabled. proposal is to return different text but to return the same LOCK_TIMEOUT sql state in both cases.
          Mike Matrigali made changes -
          Issue & fix info High Value Fix,Patch Available,Release Note Needed [ 10422,10102,10101 ]
          Hide
          Mike Matrigali added a comment -

          In DDLConstantAction.java is the following comment. After my proposed change there will be no difference between the 2 lock timeout cases, which I think is ok, but wanted to point it out:

          if (se.getMessageId().equals(SQLState.LOCK_TIMEOUT)) {
          // We don't test for SQLState.DEADLOCK or
          // .LOCK_TIMEOUT_LOG here because a) if it is a
          // deadlock, it may be better to expose it, and b)
          // LOCK_TIMEOUT_LOG happens when the app has set
          // derby.locks.deadlockTrace=true, in which case we
          // don't want to mask the timeout. So in both the
          // latter cases we just throw.
          if (useTc == nestedTc)

          { // clean up after use of nested transaction, // then try again in outer transaction useTc = tc; nestedTc.destroy(); continue; }
          Show
          Mike Matrigali added a comment - In DDLConstantAction.java is the following comment. After my proposed change there will be no difference between the 2 lock timeout cases, which I think is ok, but wanted to point it out: if (se.getMessageId().equals(SQLState.LOCK_TIMEOUT)) { // We don't test for SQLState.DEADLOCK or // .LOCK_TIMEOUT_LOG here because a) if it is a // deadlock, it may be better to expose it, and b) // LOCK_TIMEOUT_LOG happens when the app has set // derby.locks.deadlockTrace=true, in which case we // don't want to mask the timeout. So in both the // latter cases we just throw. if (useTc == nestedTc) { // clean up after use of nested transaction, // then try again in outer transaction useTc = tc; nestedTc.destroy(); continue; }
          Hide
          Bryan Pendleton added a comment -

          +1 to using SQLState rather than MessageId.

          Show
          Bryan Pendleton added a comment - +1 to using SQLState rather than MessageId.
          Hide
          Mike Matrigali added a comment -

          I have updated the previous patch to include test changes, and have changed LOCK_TIMEOUT_LOG to return the same SQL state as LOCK_TIMEOUT as suggested. I have left the behavior of DDLConstantAction with reguard to LOCK_TIMEOUT_LOG the same, so it will still immediately throw an exception in this case rather than retry.

          A patch very similar to this has passed all tests except for failures in NativeAuthProcs which consistently fail on my machine against a clean codeline on the trunk, i will run a full set of tests on this patch tonight.

          Show
          Mike Matrigali added a comment - I have updated the previous patch to include test changes, and have changed LOCK_TIMEOUT_LOG to return the same SQL state as LOCK_TIMEOUT as suggested. I have left the behavior of DDLConstantAction with reguard to LOCK_TIMEOUT_LOG the same, so it will still immediately throw an exception in this case rather than retry. A patch very similar to this has passed all tests except for failures in NativeAuthProcs which consistently fail on my machine against a clean codeline on the trunk, i will run a full set of tests on this patch tonight.
          Mike Matrigali made changes -
          Attachment derby-5564_2.patch [ 12509954 ]
          Hide
          Knut Anders Hatlen added a comment -

          The patch looks fine to me. You might want to update some more references to the old state, though:

          • LazyDefaultSchemaCreationTest: the constant LOCK_TIMEOUT_LOG is no longer used and could be removed
          • StressTestMulti: lists 40XL2 as a state to ignore in a couple of places
          • the translated messages still use the old message id

          Nit: The method name isLockTimeoutOrLockDeadlockStandardException() is a mouthful. Perhaps something simpler, like isLockTimeoutOrDeadlock(), would be just as good. (And similar: isLockTimeoutStandardException() -> isLockTimeout())

          Show
          Knut Anders Hatlen added a comment - The patch looks fine to me. You might want to update some more references to the old state, though: LazyDefaultSchemaCreationTest: the constant LOCK_TIMEOUT_LOG is no longer used and could be removed StressTestMulti: lists 40XL2 as a state to ignore in a couple of places the translated messages still use the old message id Nit: The method name isLockTimeoutOrLockDeadlockStandardException() is a mouthful. Perhaps something simpler, like isLockTimeoutOrDeadlock(), would be just as good. (And similar: isLockTimeoutStandardException() -> isLockTimeout())
          Hide
          Mike Matrigali added a comment -

          I have not previously ever made a change to translated error message files. Is there any special considerations with editors an such to not corrupting them? I assume I just need to change all the 40XL2 references to 40XL1.T.1

          Show
          Mike Matrigali added a comment - I have not previously ever made a change to translated error message files. Is there any special considerations with editors an such to not corrupting them? I assume I just need to change all the 40XL2 references to 40XL1.T.1
          Hide
          Knut Anders Hatlen added a comment -

          The message files only contain plain ascii characters, so any text editor that handles ascii should be fine. Or use perl:

          perl -pi -e s/40XL2/40XL1.T.1/g messages*.properties

          Show
          Knut Anders Hatlen added a comment - The message files only contain plain ascii characters, so any text editor that handles ascii should be fine. Or use perl: perl -pi -e s/40XL2/40XL1.T.1/g messages*.properties
          Hide
          Mike Matrigali added a comment -

          I believe this patch addresses Knut's review comments. Still have to run nightly tests, so not ready for review. My last run saw some issues in upgrade, which may be cascade from errors in NativeAuth test failures I am seeing
          on my laptop. So working through those first.

          Show
          Mike Matrigali added a comment - I believe this patch addresses Knut's review comments. Still have to run nightly tests, so not ready for review. My last run saw some issues in upgrade, which may be cascade from errors in NativeAuth test failures I am seeing on my laptop. So working through those first.
          Mike Matrigali made changes -
          Attachment derby-5564_3.patch [ 12510133 ]
          Hide
          Knut Anders Hatlen added a comment -

          Thanks, Mike. The #3 patch addresses all my comments. +1 to commit if the tests pass.

          Show
          Knut Anders Hatlen added a comment - Thanks, Mike. The #3 patch addresses all my comments. +1 to commit if the tests pass.
          Hide
          Mike Matrigali added a comment -

          All tests passed, committed to trunk:
          s1_ibm16:18>svn commit

          Sending java\engine\org\apache\derby\iapi\error\StandardException.java
          Sending java\engine\org\apache\derby\iapi\sql\dictionary\SPSDescriptor.java
          Sending java\engine\org\apache\derby\impl\services\daemon\IndexStatisticsDaemonImpl.java
          Sending java\engine\org\apache\derby\impl\sql\catalog\SequenceUpdater.java
          Sending java\engine\org\apache\derby\impl\sql\execute\DDLConstantAction.java
          Sending java\engine\org\apache\derby\impl\sql\execute\LockTableConstantAction.java
          Sending java\engine\org\apache\derby\impl\store\access\btree\BTreePostCommit.java
          Sending java\engine\org\apache\derby\impl\store\access\heap\HeapPostCommit.java
          Sending java\engine\org\apache\derby\impl\store\raw\data\ReclaimSpaceHelper.java
          Sending java\engine\org\apache\derby\impl\store\raw\xact\Xact.java
          Sending java\engine\org\apache\derby\jdbc\XATransactionState.java
          Sending java\engine\org\apache\derby\loc\messages.xml
          Sending java\engine\org\apache\derby\loc\messages_cs.properties
          Sending java\engine\org\apache\derby\loc\messages_de_DE.properties
          Sending java\engine\org\apache\derby\loc\messages_es.properties
          Sending java\engine\org\apache\derby\loc\messages_fr.properties
          Sending java\engine\org\apache\derby\loc\messages_hu.properties
          Sending java\engine\org\apache\derby\loc\messages_it.properties
          Sending java\engine\org\apache\derby\loc\messages_ja_JP.properties
          Sending java\engine\org\apache\derby\loc\messages_ko_KR.properties
          Sending java\engine\org\apache\derby\loc\messages_pl.properties
          Sending java\engine\org\apache\derby\loc\messages_pt_BR.properties
          Sending java\engine\org\apache\derby\loc\messages_ru.properties
          Sending java\engine\org\apache\derby\loc\messages_zh_CN.properties
          Sending java\engine\org\apache\derby\loc\messages_zh_TW.properties
          Sending java\shared\org\apache\derby\shared\common\reference\SQLState.java
          Sending java\testing\org\apache\derbyTesting\functionTests\tests\lang\ErrorMessageTest.java
          Sending java\testing\org\apache\derbyTesting\functionTests\tests\lang\LazyDefaultSchemaCreationTest.java
          Transmitting file data ............................
          Committed revision 1230100.

          Show
          Mike Matrigali added a comment - All tests passed, committed to trunk: s1_ibm16:18>svn commit Sending java\engine\org\apache\derby\iapi\error\StandardException.java Sending java\engine\org\apache\derby\iapi\sql\dictionary\SPSDescriptor.java Sending java\engine\org\apache\derby\impl\services\daemon\IndexStatisticsDaemonImpl.java Sending java\engine\org\apache\derby\impl\sql\catalog\SequenceUpdater.java Sending java\engine\org\apache\derby\impl\sql\execute\DDLConstantAction.java Sending java\engine\org\apache\derby\impl\sql\execute\LockTableConstantAction.java Sending java\engine\org\apache\derby\impl\store\access\btree\BTreePostCommit.java Sending java\engine\org\apache\derby\impl\store\access\heap\HeapPostCommit.java Sending java\engine\org\apache\derby\impl\store\raw\data\ReclaimSpaceHelper.java Sending java\engine\org\apache\derby\impl\store\raw\xact\Xact.java Sending java\engine\org\apache\derby\jdbc\XATransactionState.java Sending java\engine\org\apache\derby\loc\messages.xml Sending java\engine\org\apache\derby\loc\messages_cs.properties Sending java\engine\org\apache\derby\loc\messages_de_DE.properties Sending java\engine\org\apache\derby\loc\messages_es.properties Sending java\engine\org\apache\derby\loc\messages_fr.properties Sending java\engine\org\apache\derby\loc\messages_hu.properties Sending java\engine\org\apache\derby\loc\messages_it.properties Sending java\engine\org\apache\derby\loc\messages_ja_JP.properties Sending java\engine\org\apache\derby\loc\messages_ko_KR.properties Sending java\engine\org\apache\derby\loc\messages_pl.properties Sending java\engine\org\apache\derby\loc\messages_pt_BR.properties Sending java\engine\org\apache\derby\loc\messages_ru.properties Sending java\engine\org\apache\derby\loc\messages_zh_CN.properties Sending java\engine\org\apache\derby\loc\messages_zh_TW.properties Sending java\shared\org\apache\derby\shared\common\reference\SQLState.java Sending java\testing\org\apache\derbyTesting\functionTests\tests\lang\ErrorMessageTest.java Sending java\testing\org\apache\derbyTesting\functionTests\tests\lang\LazyDefaultSchemaCreationTest.java Transmitting file data ............................ Committed revision 1230100.
          Mike Matrigali made changes -
          Fix Version/s 10.9.0.0 [ 12316344 ]
          Hide
          Mike Matrigali added a comment -

          adding release note for the issue.

          Show
          Mike Matrigali added a comment - adding release note for the issue.
          Mike Matrigali made changes -
          Attachment releaseNote.html [ 12510271 ]
          Hide
          Mike Matrigali added a comment -

          I am planning on backporting this fix, let me know if you don't think it should be.

          Show
          Mike Matrigali added a comment - I am planning on backporting this fix, let me know if you don't think it should be.
          Hide
          Mike Matrigali added a comment -

          issue has existed at least as far back as 10.1

          Show
          Mike Matrigali added a comment - issue has existed at least as far back as 10.1
          Mike Matrigali made changes -
          Affects Version/s 10.7.1.1 [ 12315564 ]
          Affects Version/s 10.6.2.1 [ 12315343 ]
          Affects Version/s 10.5.3.0 [ 12314117 ]
          Affects Version/s 10.4.2.0 [ 12313345 ]
          Affects Version/s 10.3.3.0 [ 12313142 ]
          Affects Version/s 10.2.2.0 [ 12312027 ]
          Affects Version/s 10.1.3.1 [ 12311953 ]
          Hide
          Dag H. Wanvik added a comment -

          Thanks for the release notes. Looks good. Nit: the last sentence in that rationale looked weird, maybe it had an extraneous period inside it?

          Show
          Dag H. Wanvik added a comment - Thanks for the release notes. Looks good. Nit: the last sentence in that rationale looked weird, maybe it had an extraneous period inside it?
          Hide
          Mike Matrigali added a comment -

          removed extraneous text noted by review comments.

          Show
          Mike Matrigali added a comment - removed extraneous text noted by review comments.
          Mike Matrigali made changes -
          Attachment releaseNote.html [ 12510387 ]
          Hide
          Mike Matrigali added a comment -

          backported change to 10.8.

          Show
          Mike Matrigali added a comment - backported change to 10.8.
          Mike Matrigali made changes -
          Fix Version/s 10.8.2.3 [ 12318540 ]
          Hide
          Mike Matrigali added a comment -

          fixed issue and backported from trunk to 10.8, 10.7, 10.6 and 10.5 branches.

          Show
          Mike Matrigali added a comment - fixed issue and backported from trunk to 10.8, 10.7, 10.6 and 10.5 branches.
          Mike Matrigali made changes -
          Status Open [ 1 ] Resolved [ 5 ]
          Fix Version/s 10.5.3.2 [ 12315436 ]
          Fix Version/s 10.6.2.3 [ 12315434 ]
          Fix Version/s 10.7.1.4 [ 12315902 ]
          Resolution Fixed [ 1 ]
          Knut Anders Hatlen made changes -
          Link This issue is related to DERBY-5961 [ DERBY-5961 ]
          Hide
          Kathey Marsden added a comment -

          Updating release note summary per Rick's recommendation

          Show
          Kathey Marsden added a comment - Updating release note summary per Rick's recommendation
          Kathey Marsden made changes -
          Attachment releaseNote.html [ 12551736 ]
          Kathey Marsden made changes -
          Fix Version/s 10.8.3.0 [ 12323456 ]
          Fix Version/s 10.8.2.3 [ 12318540 ]
          Gavin made changes -
          Workflow jira [ 12647768 ] Default workflow, editable Closed status [ 12802120 ]
          Hide
          Knut Anders Hatlen added a comment -

          [bulk update: close all resolved issues that haven't had any activity the last year]

          Show
          Knut Anders Hatlen added a comment - [bulk update: close all resolved issues that haven't had any activity the last year]
          Knut Anders Hatlen made changes -
          Status Resolved [ 5 ] Closed [ 6 ]

            People

            • Assignee:
              Mike Matrigali
              Reporter:
              Brett Bergquist
            • Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development