  1. Ambari
  2. AMBARI-13291

JPA Threads Stuck In ConcurrencyManager Freezes Ambari

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.1.2
    • Fix Version/s: 2.2.0
    • Component/s: ambari-server
    • Labels:
      None

      Description

      Related to AMBARI-13245: Ambari can stop functioning, with dead threads, during a request that is currently in progress. The cause appears to be a performance fix that processes StageEntity instances asynchronously to construct Stage instances.

      Consider:

      • We have n StageEntity instances from the database, each with a reference to the single EntityManager instance that retrieved it from the DAO
      • Now, we use Parallel and spawn a bunch of threads to construct Stage instances from the StageEntity instances
      • However, StageEntity has LAZY associations that are accessed during the construction of Stage, and those associations are retrieved with the same EntityManager instance.
      • Because multiple threads are constructing Stage instances, we have concurrent access to the same EntityManager, which is not allowed because an EntityManager is not thread-safe.
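
      One way to avoid the concurrent access described above is to resolve every lazy association on the thread that owns the persistence context, and hand the worker threads only immutable snapshots. The following is a minimal sketch of that idea, written without JPA so that it runs on its own. FakeStageEntity, StageSnapshot, and buildStages are hypothetical names for illustration, not Ambari classes, and this is not necessarily how the issue was actually fixed.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Sketch: resolve lazy data on the owning thread, parallelize only the pure work. */
class EagerSnapshotSketch {

    /** Hypothetical stand-in for an entity with a lazily loaded association. */
    static final class FakeStageEntity {
        final long stageId;
        private List<String> commands; // null until first access ("lazy")

        FakeStageEntity(long stageId) {
            this.stageId = stageId;
        }

        // Not thread-safe, like a lazy collection bound to a single EntityManager.
        List<String> getCommands() {
            if (commands == null) {
                commands = new ArrayList<>();
                commands.add("command-for-stage-" + stageId);
            }
            return commands;
        }
    }

    /** Immutable copy of the data a worker needs; safe to share across threads. */
    static final class StageSnapshot {
        final long stageId;
        final List<String> commands;

        StageSnapshot(long stageId, List<String> commands) {
            this.stageId = stageId;
            this.commands = Collections.unmodifiableList(new ArrayList<>(commands));
        }
    }

    static List<String> buildStages(List<FakeStageEntity> entities) {
        // Step 1: touch every lazy association on the calling thread only.
        List<StageSnapshot> snapshots = new ArrayList<>();
        for (FakeStageEntity entity : entities) {
            snapshots.add(new StageSnapshot(entity.stageId, entity.getCommands()));
        }

        // Step 2: workers see only the snapshots, never the entities.
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (StageSnapshot s : snapshots) {
                futures.add(pool.submit(() -> "Stage " + s.stageId + " " + s.commands));
            }
            List<String> result = new ArrayList<>();
            for (Future<String> future : futures) {
                result.add(future.get()); // preserves submission order
            }
            return result;
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while building stages", ex);
        } catch (ExecutionException ex) {
            throw new IllegalStateException("stage construction failed", ex.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<FakeStageEntity> entities = new ArrayList<>();
        for (long id = 1; id <= 3; id++) {
            entities.add(new FakeStageEntity(id));
        }
        buildStages(entities).forEach(System.out::println);
    }
}
```

      The trade-off is that the database access stays sequential. If loading the associations dominates the cost, the parallel step buys little, and constructing the Stage instances on a single thread may be the simpler choice.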

      It was previously thought that using only a single thread pool in Parallel would prevent this issue. However, it occurred again during a stack distribution on a running cluster:

      Exception Description: A signal was attempted before wait() on ConcurrencyManager. This normally means that an attempt was made to
      commit or rollback a transaction before it was started, or to rollback a transaction twice.
      	at java.util.concurrent.FutureTask.report(FutureTask.java:122)
      	at java.util.concurrent.FutureTask.get(FutureTask.java:192)
      	at org.apache.ambari.server.utils.Parallel.forLoop(Parallel.java:214)
      	at org.apache.ambari.server.utils.Parallel.forLoop(Parallel.java:128)
      	at org.apache.ambari.server.actionmanager.ActionDBAccessorImpl.getStagesInProgress(ActionDBAccessorImpl.java:215)
      	at org.apache.ambari.server.actionmanager.ActionScheduler.doWork(ActionScheduler.java:230)
      	at org.apache.ambari.server.actionmanager.ActionScheduler.run(ActionScheduler.java:195)
      	at java.lang.Thread.run(Thread.java:745)
      Caused by: Exception [EclipseLink-2004] (Eclipse Persistence Services - 2.5.2.v20140319-9ad6abd): org.eclipse.persistence.exceptions.ConcurrencyException
      Exception Description: A signal was attempted before wait() on ConcurrencyManager. This normally means that an attempt was made to
      commit or rollback a transaction before it was started, or to rollback a transaction twice.
      	at org.eclipse.persistence.exceptions.ConcurrencyException.signalAttemptedBeforeWait(ConcurrencyException.java:84)
      	at org.eclipse.persistence.internal.helper.ConcurrencyManager.releaseReadLock(ConcurrencyManager.java:468)
      	at org.eclipse.persistence.internal.identitymaps.CacheKey.releaseReadLock(CacheKey.java:468)
      	at org.eclipse.persistence.internal.sessions.UnitOfWorkImpl.cloneAndRegisterObject(UnitOfWorkImpl.java:1044)
      	at org.eclipse.persistence.internal.sessions.UnitOfWorkImpl.cloneAndRegisterObject(UnitOfWorkImpl.java:955)
      	at org.eclipse.persistence.internal.sessions.UnitOfWorkIdentityMapAccessor.getAndCloneCacheKeyFromParent(UnitOfWorkIdentityMapAccessor.java:209)
      	at org.eclipse.persistence.internal.sessions.UnitOfWorkIdentityMapAccessor.getFromIdentityMap(UnitOfWorkIdentityMapAccessor.java:137)
      	at org.eclipse.persistence.internal.sessions.UnitOfWorkImpl.registerExistingObject(UnitOfWorkImpl.java:3942)
      	at org.eclipse.persistence.internal.sessions.UnitOfWorkImpl.registerExistingObject(UnitOfWorkImpl.java:3894)
      	at org.eclipse.persistence.mappings.CollectionMapping.buildElementUnitOfWorkClone(CollectionMapping.java:308)
      	at org.eclipse.persistence.mappings.CollectionMapping.buildElementClone(CollectionMapping.java:321)
      	at org.eclipse.persistence.internal.queries.ContainerPolicy.addNextValueFromIteratorInto(ContainerPolicy.java:217)
      	at org.eclipse.persistence.mappings.CollectionMapping.buildCloneForPartObject(CollectionMapping.java:223)
      	at org.eclipse.persistence.internal.indirection.UnitOfWorkQueryValueHolder.buildCloneFor(UnitOfWorkQueryValueHolder.java:60)
      	at org.eclipse.persistence.internal.indirection.UnitOfWorkValueHolder.instantiateImpl(UnitOfWorkValueHolder.java:173)
      	at org.eclipse.persistence.internal.indirection.UnitOfWorkValueHolder.instantiate(UnitOfWorkValueHolder.java:234)
      	at org.eclipse.persistence.internal.indirection.DatabaseValueHolder.getValue(DatabaseValueHolder.java:89)
      	at org.eclipse.persistence.indirection.IndirectList.buildDelegate(IndirectList.java:252)
      	at org.eclipse.persistence.indirection.IndirectList.getDelegate(IndirectList.java:423)
      	at org.eclipse.persistence.indirection.IndirectList$1.<init>(IndirectList.java:551)
      	at org.eclipse.persistence.indirection.IndirectList.listIterator(IndirectList.java:550)
      	at org.eclipse.persistence.indirection.IndirectList.iterator(IndirectList.java:514)
      	at org.apache.ambari.server.actionmanager.Stage.<init>(Stage.java:157)
      	at org.apache.ambari.server.actionmanager.StageFactoryImpl.createExisting(StageFactoryImpl.java:77)
      	at org.apache.ambari.server.actionmanager.ActionDBAccessorImpl$1.run(ActionDBAccessorImpl.java:218)
      	at org.apache.ambari.server.actionmanager.ActionDBAccessorImpl$1.run(ActionDBAccessorImpl.java:215)
      	at org.apache.ambari.server.utils.Parallel$1.call(Parallel.java:178)
      	at org.apache.ambari.server.utils.Parallel$1.call(Parallel.java:173)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
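
      The trace above has two parts: the upper frames (FutureTask.get, Parallel.forLoop, ActionScheduler) belong to the thread waiting for the result, and the "Caused by" section belongs to the worker thread that hit the ConcurrencyException. This is standard java.util.concurrent behavior: Future.get() rethrows a worker failure wrapped in an ExecutionException. A small self-contained sketch of that wrapping, where FutureCauseSketch and the IllegalStateException are illustrative stand-ins rather than Ambari or EclipseLink code:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class FutureCauseSketch {

    /** Runs a failing task on a worker thread and returns the cause the caller observes. */
    static Throwable causeSeenByCaller() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            // Stand-in for the exception thrown inside the worker thread.
            Callable<Void> failing = () -> {
                throw new IllegalStateException("failure inside worker thread");
            };
            Future<Void> future = pool.submit(failing);
            future.get(); // throws ExecutionException on the calling thread
            return null;  // not reached in this sketch
        } catch (ExecutionException ex) {
            // The worker's original exception is available as the cause.
            return ex.getCause();
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            return ex;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(causeSeenByCaller());
    }
}
```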
      

            People

            • Assignee: Jonathan Hurley (jonathan.hurley)
            • Reporter: Jonathan Hurley (jonathan.hurley)