Details
- Bug
- Status: Resolved
- Critical
- Resolution: Fixed
- 2.1.2
- None
Description
Related to AMBARI-13245, Ambari can stop functioning with dead threads during a request that's currently in progress. The cause of this problem appears to be a performance fix that processes StageEntity instances asynchronously to construct Stage instances.
Consider:
- We have n StageEntity instances from the database, each with a reference to the single EntityManager instance that retrieved it from the DAO
- Now, we use Parallel and spawn a bunch of threads to construct Stage instances from the StageEntity instances
- However, StageEntity has LAZY associations that are accessed during the construction of Stage, and those associations are retrieved with the same EntityManager instance.
- Because multiple threads are constructing Stage instances, we have concurrent access to the same EntityManager, which is a no-no since an EntityManager is not thread-safe.
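As an illustration only, here is a minimal sketch of one way to avoid the pattern described above: resolve every lazy association on the thread that owns the persistence context, then hand only plain snapshots to worker threads. This is not the change that was applied to Ambari. FakeStageEntity, StageSnapshot, and buildStages are hypothetical stand-ins, and no real JPA or Ambari classes are used; the "lazy load" is simulated by a getter that refuses to run on any thread other than the one that created the entity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class StageLoadingSketch {

    /** Hypothetical stand-in for an entity whose lazy data is tied to one thread's context. */
    static final class FakeStageEntity {
        private final long stageId;
        private final Thread owner;          // thread that "owns" the simulated persistence context
        private final List<String> commands; // stand-in for a LAZY association

        FakeStageEntity(long stageId, List<String> commands) {
            this.stageId = stageId;
            this.owner = Thread.currentThread();
            this.commands = new ArrayList<>(commands);
        }

        long getStageId() {
            return stageId;
        }

        /** Simulated lazy load: fails when called from any thread other than the owner. */
        List<String> getCommands() {
            if (Thread.currentThread() != owner) {
                throw new IllegalStateException("lazy load attempted from a foreign thread");
            }
            return new ArrayList<>(commands);
        }
    }

    /** Plain snapshot with no link back to the persistence context. */
    static final class StageSnapshot {
        final long stageId;
        final List<String> commands;

        StageSnapshot(long stageId, List<String> commands) {
            this.stageId = stageId;
            this.commands = commands;
        }
    }

    /** Builds one summary string per entity; only the second step runs in parallel. */
    static List<String> buildStages(List<FakeStageEntity> entities) {
        // Step 1: touch every lazy association on the calling (owning) thread.
        List<StageSnapshot> snapshots = new ArrayList<>();
        for (FakeStageEntity entity : entities) {
            snapshots.add(new StageSnapshot(entity.getStageId(), entity.getCommands()));
        }

        // Step 2: fan out over the snapshots, which need no further loading.
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (StageSnapshot snapshot : snapshots) {
                futures.add(pool.submit(
                        () -> "stage-" + snapshot.stageId + ":" + snapshot.commands.size()));
            }
            List<String> result = new ArrayList<>();
            for (Future<String> future : futures) {
                result.add(future.get()); // results are collected in submission order
            }
            return result;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

If step 1 were moved inside the submitted tasks, the simulated getCommands() call would throw on the worker threads, which mirrors (in a much simplified form) the shared-EntityManager access described in the list above.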
It was previously thought that using only a single thread pool in Parallel would prevent this issue. However, it occurred again during a stack distribution on a running cluster:
Exception Description: A signal was attempted before wait() on ConcurrencyManager. This normally means that an attempt was made to commit or rollback a transaction before it was started, or to rollback a transaction twice.
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:192)
    at org.apache.ambari.server.utils.Parallel.forLoop(Parallel.java:214)
    at org.apache.ambari.server.utils.Parallel.forLoop(Parallel.java:128)
    at org.apache.ambari.server.actionmanager.ActionDBAccessorImpl.getStagesInProgress(ActionDBAccessorImpl.java:215)
    at org.apache.ambari.server.actionmanager.ActionScheduler.doWork(ActionScheduler.java:230)
    at org.apache.ambari.server.actionmanager.ActionScheduler.run(ActionScheduler.java:195)
    at java.lang.Thread.run(Thread.java:745)
Caused by: Exception [EclipseLink-2004] (Eclipse Persistence Services - 2.5.2.v20140319-9ad6abd): org.eclipse.persistence.exceptions.ConcurrencyException
Exception Description: A signal was attempted before wait() on ConcurrencyManager. This normally means that an attempt was made to commit or rollback a transaction before it was started, or to rollback a transaction twice.
    at org.eclipse.persistence.exceptions.ConcurrencyException.signalAttemptedBeforeWait(ConcurrencyException.java:84)
    at org.eclipse.persistence.internal.helper.ConcurrencyManager.releaseReadLock(ConcurrencyManager.java:468)
    at org.eclipse.persistence.internal.identitymaps.CacheKey.releaseReadLock(CacheKey.java:468)
    at org.eclipse.persistence.internal.sessions.UnitOfWorkImpl.cloneAndRegisterObject(UnitOfWorkImpl.java:1044)
    at org.eclipse.persistence.internal.sessions.UnitOfWorkImpl.cloneAndRegisterObject(UnitOfWorkImpl.java:955)
    at org.eclipse.persistence.internal.sessions.UnitOfWorkIdentityMapAccessor.getAndCloneCacheKeyFromParent(UnitOfWorkIdentityMapAccessor.java:209)
    at org.eclipse.persistence.internal.sessions.UnitOfWorkIdentityMapAccessor.getFromIdentityMap(UnitOfWorkIdentityMapAccessor.java:137)
    at org.eclipse.persistence.internal.sessions.UnitOfWorkImpl.registerExistingObject(UnitOfWorkImpl.java:3942)
    at org.eclipse.persistence.internal.sessions.UnitOfWorkImpl.registerExistingObject(UnitOfWorkImpl.java:3894)
    at org.eclipse.persistence.mappings.CollectionMapping.buildElementUnitOfWorkClone(CollectionMapping.java:308)
    at org.eclipse.persistence.mappings.CollectionMapping.buildElementClone(CollectionMapping.java:321)
    at org.eclipse.persistence.internal.queries.ContainerPolicy.addNextValueFromIteratorInto(ContainerPolicy.java:217)
    at org.eclipse.persistence.mappings.CollectionMapping.buildCloneForPartObject(CollectionMapping.java:223)
    at org.eclipse.persistence.internal.indirection.UnitOfWorkQueryValueHolder.buildCloneFor(UnitOfWorkQueryValueHolder.java:60)
    at org.eclipse.persistence.internal.indirection.UnitOfWorkValueHolder.instantiateImpl(UnitOfWorkValueHolder.java:173)
    at org.eclipse.persistence.internal.indirection.UnitOfWorkValueHolder.instantiate(UnitOfWorkValueHolder.java:234)
    at org.eclipse.persistence.internal.indirection.DatabaseValueHolder.getValue(DatabaseValueHolder.java:89)
    at org.eclipse.persistence.indirection.IndirectList.buildDelegate(IndirectList.java:252)
    at org.eclipse.persistence.indirection.IndirectList.getDelegate(IndirectList.java:423)
    at org.eclipse.persistence.indirection.IndirectList$1.<init>(IndirectList.java:551)
    at org.eclipse.persistence.indirection.IndirectList.listIterator(IndirectList.java:550)
    at org.eclipse.persistence.indirection.IndirectList.iterator(IndirectList.java:514)
    at org.apache.ambari.server.actionmanager.Stage.<init>(Stage.java:157)
    at org.apache.ambari.server.actionmanager.StageFactoryImpl.createExisting(StageFactoryImpl.java:77)
    at org.apache.ambari.server.actionmanager.ActionDBAccessorImpl$1.run(ActionDBAccessorImpl.java:218)
    at org.apache.ambari.server.actionmanager.ActionDBAccessorImpl$1.run(ActionDBAccessorImpl.java:215)
    at org.apache.ambari.server.utils.Parallel$1.call(Parallel.java:178)
    at org.apache.ambari.server.utils.Parallel$1.call(Parallel.java:173)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)