Derby
  1. Derby
  2. DERBY-2432

Unimplemented transaction time out for XA transactions may cause that locks will not be released when client terminates outside a unit of work.

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 10.3.1.4
    • Component/s: JDBC
    • Labels:
      None
    • Urgency:
      Normal

      Description

      The XAResource interface provides function setTransactionTimeout which is currently not supported in derby.

      When client application uses client driver to connect to derby database and the application crashes outside the unit of work of XA transaction and the transaction is not committed or rolled back yet the locks held by the transaction will not be released.

      1. d2432_v5.stat
        0.9 kB
        Julius Stroffek
      2. d2432_v5.diff
        41 kB
        Julius Stroffek
      3. d2432_v4.stat
        0.9 kB
        Julius Stroffek
      4. d2432_v4.diff
        41 kB
        Julius Stroffek
      5. d2432_v3.stat
        0.9 kB
        Julius Stroffek
      6. d2432_v3.diff
        40 kB
        Julius Stroffek
      7. d2432_v2.stat
        0.9 kB
        Julius Stroffek
      8. d2432_v2.diff
        40 kB
        Julius Stroffek
      9. d2432.stat
        0.9 kB
        Julius Stroffek
      10. d2432.diff
        38 kB
        Julius Stroffek
      11. description.txt
        3 kB
        Julius Stroffek

        Issue Links

          Activity

          Hide
          Julius Stroffek added a comment -

          There is a commented code in NetXAResource.setTransactionTimeout which contains a call to non-existing method xaSetTransTimeOut. Anybody knows the origin of this code? Why it is still there?

          the function code is:

          public boolean setTransactionTimeout(int seconds) throws XAException {
          if (conn_.agent_.loggingEnabled())

          { conn_.agent_.logWriter_.traceExit(this, "setTransactionTimeout", false); }

          exceptionsOnXA = null;
          return false; // we don't support transaction timeout in our layer.
          /* int rc = xaSetTransTimeOut(seconds);
          if (rc != XAResource.XA_OK)
          throwXAException(rc); */
          }

          Show
          Julius Stroffek added a comment - There is a commented code in NetXAResource.setTransactionTimeout which contains a call to non-existing method xaSetTransTimeOut. Anybody knows the origin of this code? Why it is still there? the function code is: — public boolean setTransactionTimeout(int seconds) throws XAException { if (conn_.agent_.loggingEnabled()) { conn_.agent_.logWriter_.traceExit(this, "setTransactionTimeout", false); } exceptionsOnXA = null; return false; // we don't support transaction timeout in our layer. /* int rc = xaSetTransTimeOut(seconds); if (rc != XAResource.XA_OK) throwXAException(rc); */ }
          Hide
          Julius Stroffek added a comment -

          I ran derbyall and suites.All without failures.

          The description of the patch is in description.txt file. There are some more explanations in DERBY-2508 sub-task.

          Please, review the patch and ask questions or post comments.

          Thanks.

          Show
          Julius Stroffek added a comment - I ran derbyall and suites.All without failures. The description of the patch is in description.txt file. There are some more explanations in DERBY-2508 sub-task. Please, review the patch and ask questions or post comments. Thanks.
          Hide
          Julius Stroffek added a comment -

          The patch contains all the changes for sub-tasks DERBY-2508 and DERBY-2509.

          Show
          Julius Stroffek added a comment - The patch contains all the changes for sub-tasks DERBY-2508 and DERBY-2509 .
          Hide
          Knut Anders Hatlen added a comment -

          Hi Julo,

          I have started reviewing the patch, but I haven't studied the entire patch in detail yet. Here are some preliminary comments and questions:

          XATransactionState.java:

          • it would be good if the new methods had javadoc comments
          • perhaps the common code in xa_rollback() and xa_commit() could be factored out into a utility method, say cancelCancellationTask()?
          • I think there is a possibility that CancelXATransactionTask tries to roll back the transaction after an explicit rollback or commit has been issued. This is because CancelXATransactionTask.run() and XATransactionState.xa_commit()/xa_rollback() synchronize on different objects.
          • would it be more natural to have CancelXATransactionTask as a non-static class? Then we wouldn't need the tranState variable (it could be accessed with XATransactionState.this).
          • I'm not sure I understand why scheduleTimeoutTask() can be unsynchronized whereas all the other methods require synchronization of some kind. Could you please explain?
          • if CancelXATransactionTask.run() catches an exception, it prints it to the console. I'm not sure what's the best way to handle these exceptions, but I don't think printing them is appropriate. Perhaps it would be better to use Monitor.logMessage() (or perhaps create a Monitor.logThrowable() which could print the stack trace)?
          • XATransactionState.cancel() catches SQLException and re-throws it as XAException. Do you think we could chain the exceptions with initCause() so that we preserve the original error?
          Show
          Knut Anders Hatlen added a comment - Hi Julo, I have started reviewing the patch, but I haven't studied the entire patch in detail yet. Here are some preliminary comments and questions: XATransactionState.java: it would be good if the new methods had javadoc comments perhaps the common code in xa_rollback() and xa_commit() could be factored out into a utility method, say cancelCancellationTask()? I think there is a possibility that CancelXATransactionTask tries to roll back the transaction after an explicit rollback or commit has been issued. This is because CancelXATransactionTask.run() and XATransactionState.xa_commit()/xa_rollback() synchronize on different objects. would it be more natural to have CancelXATransactionTask as a non-static class? Then we wouldn't need the tranState variable (it could be accessed with XATransactionState.this). I'm not sure I understand why scheduleTimeoutTask() can be unsynchronized whereas all the other methods require synchronization of some kind. Could you please explain? if CancelXATransactionTask.run() catches an exception, it prints it to the console. I'm not sure what's the best way to handle these exceptions, but I don't think printing them is appropriate. Perhaps it would be better to use Monitor.logMessage() (or perhaps create a Monitor.logThrowable() which could print the stack trace)? XATransactionState.cancel() catches SQLException and re-throws it as XAException. Do you think we could chain the exceptions with initCause() so that we preserve the original error?
          Hide
          Julius Stroffek added a comment -

          Thanks Knut for spending some time with a patch.

          > XATransactionState.java:
          >
          > - it would be good if the new methods had javadoc comments

          I'll add missing javadoc.

          >
          > - perhaps the common code in xa_rollback() and xa_commit() could be factored out into a utility method, say cancelCancellationTask()?

          I thought that it is too simple to be a method.

          > - I think there is a possibility that CancelXATransactionTask tries to roll back the transaction after an explicit rollback or commit has been issued. This is because CancelXATransactionTask.run() and XATransactionState.xa_commit()/xa_rollback() synchronize on different objects.

          Yes, it is. But it does not succeed. If I will change the CancelXATransactionTask class to non-static just the same synchronization might not help. It is not documented what the TimerTask.cancel method does when the task is running (and waiting for the synchronization on run method). How can I synchronize on the same objects if some of those might be null (XATransactionState.timeoutTask) or might be null after synchronization (CancelXATransactionTask.run method)? I think that maintaining a variable whether the transaction was committed/rolled back already might be appropriate. I can check that variable in XATransactionState.cancel method.

          > - would it be more natural to have CancelXATransactionTask as a non-static class? Then we wouldn't need the tranState variable (it could be accessed with XATransactionState.this).

          I copied the way of the transaction timeout from org.apache.derby.impl.sql.conn.GenericStatementContext where the CancelTask is made static. I can not see

          > - I'm not sure I understand why scheduleTimeoutTask() can be unsynchronized whereas all the other methods require synchronization of some kind. Could you please explain?

          I wanted to avoid unnecessary synchronization. scheduleTimeoutTask is called just from EmbedXAResource.start which is synchornized on EmbedXAResource. You can not execute any other statements/functions on behalf of that global transaction in parallel or more precisely said all of those paralel invocations (except one) will fail.

          This should be explained in comment but looking again at this I think it is much more transparent to add a synchronization on scheduleTimeoutTask since the function is not heavily invoked...

          > - if CancelXATransactionTask.run() catches an exception, it prints it to the console. I'm not sure what's the best way to handle these exceptions, but I don't think printing them is appropriate. Perhaps it would be better to use Monitor.logMessage() (or perhaps create a Monitor.logThrowable() which could print the stack trace)?

          I had no idea about correct handling either. I made it to print it to the console just temporary but I forgot to think about he correct handling. I'll add a method Monitor.logThrowable as you are suggesting.

          > - XATransactionState.cancel() catches SQLException and re-throws it as XAException. Do you think we could chain the exceptions with initCause() so that we preserve the original error?

          Probably yes, because this exception will end in Monitor.logThrowable as you are suggesting.

          I'll prepare a new version of the patch ASAP. Thanks.

          Show
          Julius Stroffek added a comment - Thanks Knut for spending some time with a patch. > XATransactionState.java: > > - it would be good if the new methods had javadoc comments I'll add missing javadoc. > > - perhaps the common code in xa_rollback() and xa_commit() could be factored out into a utility method, say cancelCancellationTask()? I thought that it is too simple to be a method. > - I think there is a possibility that CancelXATransactionTask tries to roll back the transaction after an explicit rollback or commit has been issued. This is because CancelXATransactionTask.run() and XATransactionState.xa_commit()/xa_rollback() synchronize on different objects. Yes, it is. But it does not succeed. If I will change the CancelXATransactionTask class to non-static just the same synchronization might not help. It is not documented what the TimerTask.cancel method does when the task is running (and waiting for the synchronization on run method). How can I synchronize on the same objects if some of those might be null (XATransactionState.timeoutTask) or might be null after synchronization (CancelXATransactionTask.run method)? I think that maintaining a variable whether the transaction was committed/rolled back already might be appropriate. I can check that variable in XATransactionState.cancel method. > - would it be more natural to have CancelXATransactionTask as a non-static class? Then we wouldn't need the tranState variable (it could be accessed with XATransactionState.this). I copied the way of the transaction timeout from org.apache.derby.impl.sql.conn.GenericStatementContext where the CancelTask is made static. I can not see > - I'm not sure I understand why scheduleTimeoutTask() can be unsynchronized whereas all the other methods require synchronization of some kind. Could you please explain? I wanted to avoid unnecessary synchronization. scheduleTimeoutTask is called just from EmbedXAResource.start which is synchornized on EmbedXAResource. You can not execute any other statements/functions on behalf of that global transaction in parallel or more precisely said all of those paralel invocations (except one) will fail. This should be explained in comment but looking again at this I think it is much more transparent to add a synchronization on scheduleTimeoutTask since the function is not heavily invoked... > - if CancelXATransactionTask.run() catches an exception, it prints it to the console. I'm not sure what's the best way to handle these exceptions, but I don't think printing them is appropriate. Perhaps it would be better to use Monitor.logMessage() (or perhaps create a Monitor.logThrowable() which could print the stack trace)? I had no idea about correct handling either. I made it to print it to the console just temporary but I forgot to think about he correct handling. I'll add a method Monitor.logThrowable as you are suggesting. > - XATransactionState.cancel() catches SQLException and re-throws it as XAException. Do you think we could chain the exceptions with initCause() so that we preserve the original error? Probably yes, because this exception will end in Monitor.logThrowable as you are suggesting. I'll prepare a new version of the patch ASAP. Thanks.
          Hide
          Julius Stroffek added a comment -

          I have created a new patch ran XATransactionTest without any errors or failures. Now I am running suites.All and derbyall.

          I have changed the following:

          • Function Monitor.logThrowable created and I am reporting exceptions thrown during a scheduled rollback using that function.
          • A member isFinished created in XATransactionState which indicates that the transaction was committed or rolled back and this value is checked in 'cancel' method and nothing is done if the transaction was already committed or rolled back.
          • Missing javadoc comments were added.
          • The class CancelXATransactionTask changed to be non-static.
          • synchronize added to scheduleTimeoutTask method
          • call to initCause added in XATransactionState.cancel()
          Show
          Julius Stroffek added a comment - I have created a new patch ran XATransactionTest without any errors or failures. Now I am running suites.All and derbyall. I have changed the following: Function Monitor.logThrowable created and I am reporting exceptions thrown during a scheduled rollback using that function. A member isFinished created in XATransactionState which indicates that the transaction was committed or rolled back and this value is checked in 'cancel' method and nothing is done if the transaction was already committed or rolled back. Missing javadoc comments were added. The class CancelXATransactionTask changed to be non-static. synchronize added to scheduleTimeoutTask method call to initCause added in XATransactionState.cancel()
          Hide
          Knut Anders Hatlen added a comment -

          Thanks Julo, your answers make sense.

          A couple of more questions:

          Is it specified anywhere what the transaction timeout means? I found a
          description in the javadoc for XAResource.setTransactionTimeout(), but
          it doesn't say anything about the effect of the timeout.

          The property is called derby.jdbc.xaTransactionTimeout. Is it related
          to JDBC, or should we call it something else, like
          derby.xa.transactionTimeout?

          Perhaps EmbedXAResource.start() could have a comment explaining why it
          checks (timeoutSeconds != Integer.MAX_VALUE).

          I'm not sure I quite understand what the test does. For instance, it's
          not clear to me which statements/transactions should time out and what
          makes them time out. Also, would it be possible to split the test into
          smaller/simpler test cases, like (a) set transaction timeout, (b)
          start transaction, (c) sleep for a while, (d) check that the
          transaction was aborted?

          In the test, the variables xaConn, xaRes and conn are declared and
          initialized right before the for loop, but also at the beginning of
          the body of the for loop, so their initial values are always thrown
          away. Could the variable declarations be moved inside the for loop
          instead?

          In this try/catch in the test, shouldn't there have been a call to
          fail() after xaRes.end()?
          + try

          { + xaRes.end(xid, XAResource.TMFAIL); + }

          catch (XAException ex) {
          + if (ex.errorCode < XAException.XA_RBBASE
          + || ex.errorCode > XAException.XA_RBEND)
          +

          { + throw ex; + }

          + }

          I'm also not sure I understand the following code
          + } else

          { + // check the timout for associated transactions + ; + }

          It says it checks the timeout, but it doesn't do anything.

          Show
          Knut Anders Hatlen added a comment - Thanks Julo, your answers make sense. A couple of more questions: Is it specified anywhere what the transaction timeout means? I found a description in the javadoc for XAResource.setTransactionTimeout(), but it doesn't say anything about the effect of the timeout. The property is called derby.jdbc.xaTransactionTimeout. Is it related to JDBC, or should we call it something else, like derby.xa.transactionTimeout? Perhaps EmbedXAResource.start() could have a comment explaining why it checks (timeoutSeconds != Integer.MAX_VALUE). I'm not sure I quite understand what the test does. For instance, it's not clear to me which statements/transactions should time out and what makes them time out. Also, would it be possible to split the test into smaller/simpler test cases, like (a) set transaction timeout, (b) start transaction, (c) sleep for a while, (d) check that the transaction was aborted? In the test, the variables xaConn, xaRes and conn are declared and initialized right before the for loop, but also at the beginning of the body of the for loop, so their initial values are always thrown away. Could the variable declarations be moved inside the for loop instead? In this try/catch in the test, shouldn't there have been a call to fail() after xaRes.end()? + try { + xaRes.end(xid, XAResource.TMFAIL); + } catch (XAException ex) { + if (ex.errorCode < XAException.XA_RBBASE + || ex.errorCode > XAException.XA_RBEND) + { + throw ex; + } + } I'm also not sure I understand the following code + } else { + // check the timout for associated transactions + ; + } It says it checks the timeout, but it doesn't do anything.
          Hide
          Knut Anders Hatlen added a comment -

          I looked at the v2 patch. Some additional comments:

          I think Monitor.logThrowable() could be simpler:
          t.printStackTrace(getStream().getPrintWriter());

          The tranState variable in CancelXATransactionTask is not used.

          I believe the synchronization in CancelXATransactionTask.run() is not necessary as this method is the only one that uses that synchronization object, and its only call is to a synchronized method anyway.

          Show
          Knut Anders Hatlen added a comment - I looked at the v2 patch. Some additional comments: I think Monitor.logThrowable() could be simpler: t.printStackTrace(getStream().getPrintWriter()); The tranState variable in CancelXATransactionTask is not used. I believe the synchronization in CancelXATransactionTask.run() is not necessary as this method is the only one that uses that synchronization object, and its only call is to a synchronized method anyway.
          Hide
          Julius Stroffek added a comment -

          > Is it specified anywhere what the transaction timeout means? I found a
          > description in the javadoc for XAResource.setTransactionTimeout(), but
          > it doesn't say anything about the effect of the timeout.

          No, there is no exact specification what timeout means.

          > The property is called derby.jdbc.xaTransactionTimeout. Is it related
          > to JDBC, or should we call it something else, like
          > derby.xa.transactionTimeout?

          Yes, it is the 'default value' about which the javadoc
          of XAResource.setTransactionTimeout is talking about.
          And it's from JDBC specification.

          > I'm not sure I quite understand what the test does. For instance, it's
          > not clear to me which statements/transactions should time out and what
          > makes them time out. Also, would it be possible to split the test into
          > smaller/simpler test cases, like (a) set transaction timeout, (b)
          > start transaction, (c) sleep for a while, (d) check that the
          > transaction was aborted?

          Yes, that would be possible. Currently, 1000 global transactions
          are performed during the test. Everyone of them just inserts a row
          into XATT table. The rows inserted by the transactions are different.
          Some of these transactions are committed and some of them are
          left in different stages. After finishing these 1000 transactions a select
          statement is executed on that table. However, if there are still some
          unfinished transactions that were not aborted they will hold a lock
          on a XATT table until they will get rolled back by the transaction timeout.
          The number of rows in the XATT table is calculated. It is then compared
          with the excepted number of rows (the trnsaction we know we have
          committed).

          The call
          xaRes.setTransactionTimeout(5);
          before calling xaRes.start() makes the transactions to get timed out.

          Is this understandable? Would it be ok to explain this more in comments?

          > I'm also not sure I understand the following code
          > + } else

          { > + // check the timout for associated transactions > + ; > + }


          > It says it checks the timeout, but it doesn't do anything.

          The time out is setted up for the transaction at the beggining
          of the transaction. Different transactions are left in different
          states as the timeout task will try to roll back transactions
          in different state / connection association, etc. The code
          actually does not test the timeout for associated transactions
          directly, it just left the transaction in a state it is (and it is
          associated with the connection).

          > In this try/catch in the test, shouldn't there have been a call to
          > fail() after xaRes.end()?
          > + try

          { > + xaRes.end(xid, XAResource.TMFAIL); > + }

          catch (XAException ex) {
          > + if (ex.errorCode < XAException.XA_RBBASE
          > + || ex.errorCode > XAException.XA_RBEND)
          > +

          { > + throw ex; > + }


          > + }

          Might be, it will then test something more than just a timeout feature.

          > In the test, the variables xaConn, xaRes and conn are declared and
          > initialized right before the for loop, but also at the beginning of
          > the body of the for loop, so their initial values are always thrown
          > away. Could the variable declarations be moved inside the for loop
          > instead?

          Yes, i missed that.

          Show
          Julius Stroffek added a comment - > Is it specified anywhere what the transaction timeout means? I found a > description in the javadoc for XAResource.setTransactionTimeout(), but > it doesn't say anything about the effect of the timeout. No, there is no exact specification what timeout means. > The property is called derby.jdbc.xaTransactionTimeout. Is it related > to JDBC, or should we call it something else, like > derby.xa.transactionTimeout? Yes, it is the 'default value' about which the javadoc of XAResource.setTransactionTimeout is talking about. And it's from JDBC specification. > I'm not sure I quite understand what the test does. For instance, it's > not clear to me which statements/transactions should time out and what > makes them time out. Also, would it be possible to split the test into > smaller/simpler test cases, like (a) set transaction timeout, (b) > start transaction, (c) sleep for a while, (d) check that the > transaction was aborted? Yes, that would be possible. Currently, 1000 global transactions are performed during the test. Everyone of them just inserts a row into XATT table. The rows inserted by the transactions are different. Some of these transactions are committed and some of them are left in different stages. After finishing these 1000 transactions a select statement is executed on that table. However, if there are still some unfinished transactions that were not aborted they will hold a lock on a XATT table until they will get rolled back by the transaction timeout. The number of rows in the XATT table is calculated. It is then compared with the excepted number of rows (the trnsaction we know we have committed). The call xaRes.setTransactionTimeout(5); before calling xaRes.start() makes the transactions to get timed out. Is this understandable? Would it be ok to explain this more in comments? > I'm also not sure I understand the following code > + } else { > + // check the timout for associated transactions > + ; > + } > It says it checks the timeout, but it doesn't do anything. The time out is setted up for the transaction at the beggining of the transaction. Different transactions are left in different states as the timeout task will try to roll back transactions in different state / connection association, etc. The code actually does not test the timeout for associated transactions directly, it just left the transaction in a state it is (and it is associated with the connection). > In this try/catch in the test, shouldn't there have been a call to > fail() after xaRes.end()? > + try { > + xaRes.end(xid, XAResource.TMFAIL); > + } catch (XAException ex) { > + if (ex.errorCode < XAException.XA_RBBASE > + || ex.errorCode > XAException.XA_RBEND) > + { > + throw ex; > + } > + } Might be, it will then test something more than just a timeout feature. > In the test, the variables xaConn, xaRes and conn are declared and > initialized right before the for loop, but also at the beginning of > the body of the for loop, so their initial values are always thrown > away. Could the variable declarations be moved inside the for loop > instead? Yes, i missed that.
          Hide
          Julius Stroffek added a comment -

          The tests (sutes.All and derbyall) for v2 patch finished without errors and failures.

          I am attaching a new version of the patch. I am running the tests now.

          Changes compared to a v2 patch:

          • Added comment to EmbedXAResource.start about the test of transaction timeout value to Integer.MAX_VALUE
          • CancelXATransactionTask - field tranState removed, synchronization in run method also removed.
          • Monitor.logThrowable - changed as suggested (sorry I missed getPrintWriter method).
          • XATransactionTest - much more detailed comment added to testXATransactionTimeout method.
          • XATransactionTest - removed initialization of xaRes, xaConn and conn before the for loop.
          • XATransactionTest - Assert.fail() added to corresponding try-catch blok when checking the exception thrown.

          I can not see any chance that the tests could fail due to the change in a code compared to the v2 patch (I ran XATransactionTest alone without failures and changes has no impact on other tests). However, I'll let you know when my test run will finish.

          Show
          Julius Stroffek added a comment - The tests (sutes.All and derbyall) for v2 patch finished without errors and failures. I am attaching a new version of the patch. I am running the tests now. Changes compared to a v2 patch: Added comment to EmbedXAResource.start about the test of transaction timeout value to Integer.MAX_VALUE CancelXATransactionTask - field tranState removed, synchronization in run method also removed. Monitor.logThrowable - changed as suggested (sorry I missed getPrintWriter method). XATransactionTest - much more detailed comment added to testXATransactionTimeout method. XATransactionTest - removed initialization of xaRes, xaConn and conn before the for loop. XATransactionTest - Assert.fail() added to corresponding try-catch blok when checking the exception thrown. I can not see any chance that the tests could fail due to the change in a code compared to the v2 patch (I ran XATransactionTest alone without failures and changes has no impact on other tests). However, I'll let you know when my test run will finish.
          Hide
          Julius Stroffek added a comment -

          All tests (suites.All and derbyall) succeeded for v3 patch.

          Show
          Julius Stroffek added a comment - All tests (suites.All and derbyall) succeeded for v3 patch.
          Hide
          Knut Anders Hatlen added a comment -

          >> Is it specified anywhere what the transaction timeout means? I found a
          >> description in the javadoc for XAResource.setTransactionTimeout(), but
          >> it doesn't say anything about the effect of the timeout.
          >
          > No, there is no exact specification what timeout means.

          Could you explain the rationale for choosing this particular
          behaviour? How do other databases use the timeout?

          I'm asking because it is not quite clear to me how the timeout is
          used, and what users would expect from the timeout. From the
          description of this issue, it seems like the purpose of the timeout is
          to ensure that resources are freed if no commit or rollback is
          issued. Is this the primary use case for the timeout? If so, is the
          code that cancels the running statement necessary? (It does seem to me
          that cancelling statements is something one rather would do with a
          query timeout than with an XA transaction timeout.) And would it make
          more sense to restart the timer each time a method is called on
          XAResource, so that the timeout became some sort of idle timeout? (To
          avoid cases like: transaction manager calls prepare(), the XAResource
          votes OK, and half a second later the XAResource aborts because the
          transaction took too long.)

          Show
          Knut Anders Hatlen added a comment - >> Is it specified anywhere what the transaction timeout means? I found a >> description in the javadoc for XAResource.setTransactionTimeout(), but >> it doesn't say anything about the effect of the timeout. > > No, there is no exact specification what timeout means. Could you explain the rationale for choosing this particular behaviour? How do other databases use the timeout? I'm asking because it is not quite clear to me how the timeout is used, and what users would expect from the timeout. From the description of this issue, it seems like the purpose of the timeout is to ensure that resources are freed if no commit or rollback is issued. Is this the primary use case for the timeout? If so, is the code that cancels the running statement necessary? (It does seem to me that cancelling statements is something one rather would do with a query timeout than with an XA transaction timeout.) And would it make more sense to restart the timer each time a method is called on XAResource, so that the timeout became some sort of idle timeout? (To avoid cases like: transaction manager calls prepare(), the XAResource votes OK, and half a second later the XAResource aborts because the transaction took too long.)
          Hide
          Julius Stroffek added a comment -

          I wanted to solve the similiar problem like DERBY-2220 solved
          but also for the case where the global transaction is not associated
          with any connection. In that case without a timeout that global
          transaction may held locks forever. I do not know what is
          the purpose of timeout in jdbc specification (it is not explained)
          but I see the purpose in just not to held resource for a long time.

          > If so, is the code that cancels the running statement necessary?

          What else could we do in this case? The transaction times out so
          I thing the best way is to cancell everything. If there is a running
          statement and we will not cancel it, so shell we schedule a new
          timeout task later? Or just ignore this transaction? I think that
          doing something else would lead to not quite transparent behaviour.

          The behaviour you are proposing is also acceptable.

          > To avoid cases like: transaction manager calls prepare(), the XAResource
          > votes OK, and half a second later the XAResource aborts because the
          > transaction took too long.)

          This might be handled by cancelling a timeout task also in prepare method.
          After the application calls prepare, RM is allowed to heuristically commit
          or rollback the transaction. This means that it will no longer hold any resources.
          I do not know quite well how the heuristicall commit/rollback works. I'll explore
          this.

          About other DBs:

          IBM JDBC Universal Driver - feature not implemented.
          PostgreSQL - feature not implemented.
          MySQL - feature not implemented.
          Oracle - I do not know. I have no Oracle installed.

          Show
          Julius Stroffek added a comment - I wanted to solve the similiar problem like DERBY-2220 solved but also for the case where the global transaction is not associated with any connection. In that case without a timeout that global transaction may held locks forever. I do not know what is the purpose of timeout in jdbc specification (it is not explained) but I see the purpose in just not to held resource for a long time. > If so, is the code that cancels the running statement necessary? What else could we do in this case? The transaction times out so I thing the best way is to cancell everything. If there is a running statement and we will not cancel it, so shell we schedule a new timeout task later? Or just ignore this transaction? I think that doing something else would lead to not quite transparent behaviour. The behaviour you are proposing is also acceptable. > To avoid cases like: transaction manager calls prepare(), the XAResource > votes OK, and half a second later the XAResource aborts because the > transaction took too long.) This might be handled by cancelling a timeout task also in prepare method. After the application calls prepare, RM is allowed to heuristically commit or rollback the transaction. This means that it will no longer hold any resources. I do not know quite well how the heuristicall commit/rollback works. I'll explore this. About other DBs: — IBM JDBC Universal Driver - feature not implemented. PostgreSQL - feature not implemented. MySQL - feature not implemented. Oracle - I do not know. I have no Oracle installed.
          Hide
          Julius Stroffek added a comment -

          Heuristic branch completition seems not to work in derby. I have not found anything in the docs and I tried to leave a prepared transaction for a while and it remains uncommitted/unrolled back.

          However if the applciation crashes and there are prepared transactions pending, there exists an interface how to maintain those transactions (XAResource.recover). So, it should be ok if we will also cancel the time out task in a call to XAResource.prepare.

          Show
          Julius Stroffek added a comment - Heuristic branch completition seems not to work in derby. I have not found anything in the docs and I tried to leave a prepared transaction for a while and it remains uncommitted/unrolled back. However if the applciation crashes and there are prepared transactions pending, there exists an interface how to maintain those transactions (XAResource.recover). So, it should be ok if we will also cancel the time out task in a call to XAResource.prepare.
          Hide
          Julius Stroffek added a comment -

          I found some small piece about the xa transaction timeout behaviour in DRDA V4 - Vol 3: Distributed Data Management Architecture.

          Page 964:
          The TIMEOUT parameter is a 64-bit binary number which measures time in milliseconds. It
          represents the elapsed time before the transaction is rolled back. The timer is started at the
          beginning of the transaction SYNCCTL(New UOW).
          If the request SYNCCTL(Prepare to Commit) is not received during this time period, the
          transaction will be implicitly rolled back.

          Show
          Julius Stroffek added a comment - I found some small piece about the xa transaction timeout behaviour in DRDA V4 - Vol 3: Distributed Data Management Architecture. Page 964: The TIMEOUT parameter is a 64-bit binary number which measures time in milliseconds. It represents the elapsed time before the transaction is rolled back. The timer is started at the beginning of the transaction SYNCCTL(New UOW). If the request SYNCCTL(Prepare to Commit) is not received during this time period, the transaction will be implicitly rolled back.
          Hide
          Julius Stroffek added a comment -

          I have created an additional patch.

          changes:
          XATransactionState:

          • Added a timeout task cancellation to xa_prepare() method.
          • Common code from xa_commit/rollback/prepare moved
            to xa_finalize()
          • Some more comments added to cancel()

          I am running the tests now. However, the change should not affect the tests. I'll post the results after tests will finish.

          Show
          Julius Stroffek added a comment - I have created an additional patch. changes: XATransactionState: Added a timeout task cancellation to xa_prepare() method. Common code from xa_commit/rollback/prepare moved to xa_finalize() Some more comments added to cancel() I am running the tests now. However, the change should not affect the tests. I'll post the results after tests will finish.
          Hide
          Knut Anders Hatlen added a comment -

          Thanks for investigating this Julo. It also seems like many of the same questions were asked and answered under DERBY-2508 as well. There weren't any objections to the approach as far as I could see, so I guess it is uncontroversial to go for that solution. If there are no objections, I will commit the patch tomorrow.

          Just one final question about the changes in the v4 patch:
          > changes:
          > XATransactionState:
          > - Added a timeout task cancellation to xa_prepare() method.

          Will the locks mentioned in the description be released after xa_prepare()? If they are, this sounds fine.

          Show
          Knut Anders Hatlen added a comment - Thanks for investigating this Julo. It also seems like many of the same questions were asked and answered under DERBY-2508 as well. There weren't any objections to the approach as far as I could see, so I guess it is uncontroversial to go for that solution. If there are no objections, I will commit the patch tomorrow. Just one final question about the changes in the v4 patch: > changes: > XATransactionState: > - Added a timeout task cancellation to xa_prepare() method. Will the locks mentioned in the description be released after xa_prepare()? If they are, this sounds fine.
          Hide
          Julius Stroffek added a comment -

          > Will the locks mentioned in the description be released after xa_prepare()?

          Unfortunatelly, the would not. However, according the JTA spec after calling prepare resource manager is allowed to heuristically commit or rollback the transaction. This feature is not implementedyet in derby - at least I tried this and it does not work and I also have not found anything in docsnorthe code.

          User can use the method XAResource.recover to obtain the list of the prepared transactions and he can commit/rollback the transactions afterwards even after the application server crash. This process is also described in JTA spec, so it is possible to write a client application with some cleanup at the startup.

          The alternative to v4 patch approach is not to cancel the timeout task in a call to prepare function. We can file a JIRA to implement heuristic commit/rollbackof the global transactions and add the cancellation o ftimeout task to xa_prepare with introduction of this feature. Having the implementation of the xa transaction timeout feature temporarily rolling back prepared transactions should not matter because if I would know that my transaction should run 2s and I would like to give a timeout to that transaction I would choose a much higher value like 10s or so just to besurethat my transaction would finish. The interval between prepare and commit is expecting to be very short.

          What is your opinion on this?

          Show
          Julius Stroffek added a comment - > Will the locks mentioned in the description be released after xa_prepare()? Unfortunatelly, the would not. However, according the JTA spec after calling prepare resource manager is allowed to heuristically commit or rollback the transaction. This feature is not implementedyet in derby - at least I tried this and it does not work and I also have not found anything in docsnorthe code. User can use the method XAResource.recover to obtain the list of the prepared transactions and he can commit/rollback the transactions afterwards even after the application server crash. This process is also described in JTA spec, so it is possible to write a client application with some cleanup at the startup. The alternative to v4 patch approach is not to cancel the timeout task in a call to prepare function. We can file a JIRA to implement heuristic commit/rollbackof the global transactions and add the cancellation o ftimeout task to xa_prepare with introduction of this feature. Having the implementation of the xa transaction timeout feature temporarily rolling back prepared transactions should not matter because if I would know that my transaction should run 2s and I would like to give a timeout to that transaction I would choose a much higher value like 10s or so just to besurethat my transaction would finish. The interval between prepare and commit is expecting to be very short. What is your opinion on this?
          Hide
          Julius Stroffek added a comment -

          Tests (suites.All and derbyall) of v4 patch finished without errors.

          Show
          Julius Stroffek added a comment - Tests (suites.All and derbyall) of v4 patch finished without errors.
          Hide
          Knut Anders Hatlen added a comment -

          > What is your opinion on this?

          If it is possible that we hold on to resources forever if we cancel the timer on prepare, I would think it's safer not to do it. I also think it would be easier to explain the behaviour if we don't cancel the timer until the transaction has completed. Do you see any problem with taking the v4 patch and removing the call to xa_finalize() in xa_prepare()?

          Show
          Knut Anders Hatlen added a comment - > What is your opinion on this? If it is possible that we hold on to resources forever if we cancel the timer on prepare, I would think it's safer not to do it. I also think it would be easier to explain the behaviour if we don't cancel the timer until the transaction has completed. Do you see any problem with taking the v4 patch and removing the call to xa_finalize() in xa_prepare()?
          Hide
          Julius Stroffek added a comment -

          I changed the patch. Running the tests now. I do not see any chance that these tests could fail due to a change in v5 patch.

          Changes:

          • removed a call to XATransactionState.xa_finalize from XATransactionState.xa_prepare
          • changed comments stating that XATransactionState.isFinished is assigned to true also during a prepare.
          Show
          Julius Stroffek added a comment - I changed the patch. Running the tests now. I do not see any chance that these tests could fail due to a change in v5 patch. Changes: removed a call to XATransactionState.xa_finalize from XATransactionState.xa_prepare changed comments stating that XATransactionState.isFinished is assigned to true also during a prepare.
          Hide
          Knut Anders Hatlen added a comment -

          Thank you, Julo! I committed v5 to trunk with revision 547674.
          Please also file a JIRA issue for documenting the derby.jdbc.xaTransactionTimeout property. I think it is OK to leave it undocumented for the 10.3 release, but it would be good to have it documented in the reference manual some time.

          Show
          Knut Anders Hatlen added a comment - Thank you, Julo! I committed v5 to trunk with revision 547674. Please also file a JIRA issue for documenting the derby.jdbc.xaTransactionTimeout property. I think it is OK to leave it undocumented for the 10.3 release, but it would be good to have it documented in the reference manual some time.
          Hide
          A B added a comment -

          I think the changes for this issue have led to a couple of javadoc errors...?

          [javadoc] java\engine\org\apache\derby\jdbc\EmbedXAResource.java:530: warning - Tag @see: reference not found: Property.PROP_XA_TRANSACTION_TIMEOUT
          [javadoc] java\engine\org\apache\derby\jdbc\EmbedXAResource.java:530: warning - Tag @see: reference not found: Property.DEFAULT_XA_TRANSACTION_TIMEOUT
          [javadoc] java\engine\org\apache\derby\jdbc\XATransactionState.java:337: warning - @param argument "onPhase" is not a parameter name.

          Show
          A B added a comment - I think the changes for this issue have led to a couple of javadoc errors...? [javadoc] java\engine\org\apache\derby\jdbc\EmbedXAResource.java:530: warning - Tag @see: reference not found: Property.PROP_XA_TRANSACTION_TIMEOUT [javadoc] java\engine\org\apache\derby\jdbc\EmbedXAResource.java:530: warning - Tag @see: reference not found: Property.DEFAULT_XA_TRANSACTION_TIMEOUT [javadoc] java\engine\org\apache\derby\jdbc\XATransactionState.java:337: warning - @param argument "onPhase" is not a parameter name.
          Hide
          Julius Stroffek added a comment -

          I'll fix these and post a patch.

          Show
          Julius Stroffek added a comment - I'll fix these and post a patch.
          Hide
          A B added a comment -

          > I'll fix these and post a patch.

          Actually, it looks likes Rick already took care of them:

          URL: http://svn.apache.org/viewvc?view=rev&rev=548720

          If the changes look good to you, then no further work is required. Thanks for being willing, though

          Show
          A B added a comment - > I'll fix these and post a patch. Actually, it looks likes Rick already took care of them: URL: http://svn.apache.org/viewvc?view=rev&rev=548720 If the changes look good to you, then no further work is required. Thanks for being willing, though
          Hide
          Julius Stroffek added a comment -

          Seems to be already fixed. Thanks.

          Show
          Julius Stroffek added a comment - Seems to be already fixed. Thanks.

            People

            • Assignee:
              Julius Stroffek
              Reporter:
              Julius Stroffek
            • Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development