Uploaded image for project: 'Geronimo'
  1. Geronimo
  2. GERONIMO-6543

Aries/Geronimo XA transaction recovery not working for heuristically completed transactions

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • None
    • None
    • transaction manager
    • Security Level: public (Regular issues)
    • None
    • Apache Karaf 2.3.0, JBoss Fuse 6.1-redhat-610379

    Description

      We´re facing a problem with XA transaction recovery when a resource manager (like in our case Websphere MQ) is reporting heuristically completed transactions.
      The flow goes like that (see org.apache.geronimo.transaction.manager.RecoveryImpl.recoverResourceManager(NamedXAResource)):

      • We´re starting JBoss Fuse, data/txlog/* has been deleted before start.
      GenericResourceManager is starting recovery
      • The NamedXAResourceFactory is enlisted
      • Geronimo calls xa_recover(TM_STARTRSCAN | TMENDRSCAN) on the XA Resource (Websphere MQ)
      • Websphere MQ reports a XID which is to be recovered, Aries TxManager does not know about the XID so it tries to send xa_rollback to MQ
      • MQ already had the XID heuristically rolled back before, so it answers with XA_HEURRB (6).
      • Geronimo logs the exception, but does not do anything about it

      Since Geronimo does not do anything about the pending transaction, it stays pending withing Websphere MQ, and the same error will occur upon next restart.

      What´s missing here from our perspective is a call to xa_forget after receiving a XA_HEURB.

      Btw, in org.apache.geronimo.transaction.manager.RollbackTask.run() XA_HEURRB seems to be handled correctly by sending an xa_forget.

      Stack trace below.

      We´re using following bundles:

      [ 281] [Active     ] [            ] [       ] [   50] mvn:org.apache.geronimo.components/geronimo-connector/3.1.1
      [ 116] [Active     ] [            ] [       ] [   30] mvn:org.apache.aries.transaction/org.apache.aries.transaction.manager/1.0.1.redhat-610379
      [ 118] [Active     ] [            ] [       ] [   30] mvn:org.apache.aries.transaction/org.apache.aries.transaction.jdbc/1.0.1.redhat-610379
      [ 569] [Active     ] [Created     ] [       ] [   50] mvn:org.apache.activemq/activemq-osgi/5.9.0.redhat-610379
      

      Looking at the upstream source code the problem exists there, too: https://github.com/apache/geronimo-txmanager/blob/trunk/geronimo-transaction/src/main/java/org/apache/geronimo/transaction/manager/RecoveryImpl.java#L127.

      2015-04-29 11:55:28,184 | TRACE | FelixStartLevel | WrapperNamedXAResource | 116  | Recover called on XAResource wmq-wpreconXA
      flags:  TMENDRSCAN TMSTARTRSCAN remaining: 25165824
      2015-04-29 11:55:28,187 | TRACE | FelixStartLevel | WrapperNamedXAResource | 116  | Rollback called on XAResource wmq-wpreconXA
      Xid: {JmqiXid@545310144: formatId(4765526f), gtrid(8df91ce04c0100006f72672e6170616368652e61726965732e7472616e73616374696f6e00000000000000000000000000000000000000000000000000000000), bqual(01000000308807e04c0100006170616368652e61726965732e7472616e73616374696f6e00000000000000000000000000000000000000000000000000000000)}
      2015-04-29 11:55:28,190 | ERROR | FelixStartLevel | Recovery | 116  | Could not roll back
      javax.transaction.xa.XAException: Methode 'xa_rollback' ist mit Fehlercode '6' fehlgeschlagen.
              at com.ibm.mq.jmqi.JmqiXAResource.rollback(JmqiXAResource.java:861)
              at org.apache.geronimo.transaction.manager.WrapperNamedXAResource.rollback(WrapperNamedXAResource.java:100)
              at org.apache.geronimo.transaction.manager.RecoveryImpl.recoverResourceManager(RecoveryImpl.java:127)
              at org.apache.geronimo.transaction.manager.RecoverTask.run(RecoverTask.java:52)
              at org.apache.geronimo.transaction.manager.TransactionManagerImpl.registerNamedXAResourceFactory(TransactionManagerImpl.java:353)
              at Proxyc5f66ef2_fefc_4587_80b8_fb531eb6b156.registerNamedXAResourceFactory(Unknown Source)
              at org.apache.activemq.jms.pool.GenericResourceManager$Recovery.recover(GenericResourceManager.java:144)
              at org.apache.activemq.jms.pool.GenericResourceManager.recoverResource(GenericResourceManager.java:77)
              at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)[:1.7.0_45]
              at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)[:1.7.0_45]
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)[:1.7.0_45]
              at java.lang.reflect.Method.invoke(Method.java:606)[:1.7.0_45]
              at org.apache.aries.blueprint.utils.ReflectionUtils.invoke(ReflectionUtils.java:297)[9:org.apache.aries.blueprint.core:1.0.1.redhat-610379]
             at org.apache.aries.blueprint.container.BeanRecipe.invoke(BeanRecipe.java:958)[9:org.apache.aries.blueprint.core:1.0.1.redhat-610379]
              at org.apache.aries.blueprint.container.BeanRecipe.runBeanProcInit(BeanRecipe.java:712)[9:org.apache.aries.blueprint.core:1.0.1.redhat-610379]
              at org.apache.aries.blueprint.container.BeanRecipe.internalCreate2(BeanRecipe.java:824)[9:org.apache.aries.blueprint.core:1.0.1.redhat-610379]
              at org.apache.aries.blueprint.container.BeanRecipe.internalCreate(BeanRecipe.java:787)[9:org.apache.aries.blueprint.core:1.0.1.redhat-610379]
              at org.apache.aries.blueprint.di.AbstractRecipe$1.call(AbstractRecipe.java:79)[9:org.apache.aries.blueprint.core:1.0.1.redhat-610379]
              at java.util.concurrent.FutureTask.run(FutureTask.java:262)[:1.7.0_45]
              at org.apache.aries.blueprint.di.AbstractRecipe.create(AbstractRecipe.java:88)[9:org.apache.aries.blueprint.core:1.0.1.redhat-610379]
              at org.apache.aries.blueprint.container.BlueprintRepository.createInstances(BlueprintRepository.java:245)[9:org.apache.aries.blueprint.core:1.0.1.redhat-610379]
              at org.apache.aries.blueprint.container.BlueprintRepository.createAll(BlueprintRepository.java:183)[9:org.apache.aries.blueprint.core:1.0.1.redhat-610379]
              at org.apache.aries.blueprint.container.BlueprintContainerImpl.instantiateEagerComponents(BlueprintContainerImpl.java:676)[9:org.apache.aries.blueprint.core:1.0.1.redhat-610379]
              at org.apache.aries.blueprint.container.BlueprintContainerImpl.doRun(BlueprintContainerImpl.java:374)[9:org.apache.aries.blueprint.core:1.0.1.redhat-610379]
              at org.apache.aries.blueprint.container.BlueprintContainerImpl.run(BlueprintContainerImpl.java:261)[9:org.apache.aries.blueprint.core:1.0.1.redhat-610379]
              at org.apache.aries.blueprint.container.BlueprintExtender.createContainer(BlueprintExtender.java:270)[9:org.apache.aries.blueprint.core:1.0.1.redhat-610379]
              at org.apache.aries.blueprint.container.BlueprintExtender.modifiedBundle(BlueprintExtender.java:233)[9:org.apache.aries.blueprint.core:1.0.1.redhat-610379]
              at org.apache.aries.util.tracker.hook.BundleHookBundleTracker$Tracked.customizerModified(BundleHookBundleTracker.java:500)[11:org.apache.aries.util:1.0.1.redhat-610379]
              at org.apache.aries.util.tracker.hook.BundleHookBundleTracker$Tracked.customizerModified(BundleHookBundleTracker.java:433)[11:org.apache.aries.util:1.0.1.redhat-610379]
              at org.apache.aries.util.tracker.hook.BundleHookBundleTracker$AbstractTracked.track(BundleHookBundleTracker.java:725)[11:org.apache.aries.util:1.0.1.redhat-610379]
              at org.apache.aries.util.tracker.hook.BundleHookBundleTracker$Tracked.bundleChanged(BundleHookBundleTracker.java:463)[11:org.apache.aries.util:1.0.1.redhat-610379]
              at org.apache.aries.util.tracker.hook.BundleHookBundleTracker$BundleEventHook.event(BundleHookBundleTracker.java:422)[11:org.apache.aries.util:1.0.1.redhat-610379]
              at org.apache.felix.framework.util.SecureAction.invokeBundleEventHook(SecureAction.java:1103)[org.apache.felix.framework-4.0.3.redhat-610379.jar:]
              at org.apache.felix.framework.util.EventDispatcher.createWhitelistFromHooks(EventDispatcher.java:696)[org.apache.felix.framework-4.0.3.redhat-610379.jar:]
              at org.apache.felix.framework.util.EventDispatcher.fireBundleEvent(EventDispatcher.java:484)[org.apache.felix.framework-4.0.3.redhat-610379.jar:]
              at org.apache.felix.framework.Felix.fireBundleEvent(Felix.java:4650)[org.apache.felix.framework-4.0.3.redhat-610379.jar:]
              at org.apache.felix.framework.Felix$4.run(Felix.java:2123)[org.apache.felix.framework-4.0.3.redhat-610379.jar:]
              at org.apache.felix.framework.Felix.runInContext(Felix.java:2147)[org.apache.felix.framework-4.0.3.redhat-610379.jar:]
              at org.apache.felix.framework.Felix.startBundle(Felix.java:2121)[org.apache.felix.framework-4.0.3.redhat-610379.jar:]
              at org.apache.felix.framework.Felix.setActiveStartLevel(Felix.java:1317)[org.apache.felix.framework-4.0.3.redhat-610379.jar:]
              at org.apache.felix.framework.FrameworkStartLevelImpl.run(FrameworkStartLevelImpl.java:304)[org.apache.felix.framework-4.0.3.redhat-610379.jar:]
              at java.lang.Thread.run(Thread.java:744)[:1.7.0_45]
      

      ==============

      2. start of fuse: note that the same transaction is again causing problems.

      2015-04-29 11:56:44,623 | TRACE | FelixStartLevel | WrapperNamedXAResource | 116  | Rollback called on XAResource wmq-wpreconXA
      Xid: {JmqiXid@67301323: formatId(4765526f), gtrid(8df91ce04c0100006f72672e6170616368652e61726965732e7472616e73616374696f6e00000000000000000000000000000000000000000000000000000000), bqual(01000000308807e04c0100006170616368652e61726965732e7472616e73616374696f6e00000000000000000000000000000000000000000000000000000000)}
      2015-04-29 11:56:44,626 | ERROR | FelixStartLevel | Recovery | 116  | Could not roll back
      javax.transaction.xa.XAException: Methode 'xa_rollback' ist mit Fehlercode '6' fehlgeschlagen.
              at com.ibm.mq.jmqi.JmqiXAResource.rollback(JmqiXAResource.java:861)
              at org.apache.geronimo.transaction.manager.WrapperNamedXAResource.rollback(WrapperNamedXAResource.java:100)
              at org.apache.geronimo.transaction.manager.RecoveryImpl.recoverResourceManager(RecoveryImpl.java:127)
      

      Attachments

        1. GERONIMO-6543.patch
          5 kB
          Guillaume Nodet

        Activity

          People

            gnodet Guillaume Nodet
            j0xaf Jörn Gersdorf
            Votes:
            1 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: