Details
-
Bug
-
Status: Resolved
-
Blocker
-
Resolution: Fixed
-
2.2.2
-
None
Description
In large deployments where the number of hosts multiplied by the number of components is large (10,000, for example), the ConfigHelper.isStale() method can make tens of thousands of database queries every minute.
Consider a 3-minute trace:
server.persistence.properties.eclipselink.profiler=PerformanceMonitor
Time = 3 minutes
Counter:ReadAllQuery:org.apache.ambari.server.orm.entities.ClusterConfigMappingEntity:null 11,716
Timer:ReadAllQuery:org.apache.ambari.server.orm.entities.ClusterConfigMappingEntity:null 80,520,541,000
Timer:ReadAllQuery:org.apache.ambari.server.orm.entities.ClusterConfigMappingEntity:null:ObjectBuilding 19,741,257,000
Timer:ReadAllQuery:org.apache.ambari.server.orm.entities.ClusterConfigMappingEntity:null:QueryPreparation 414,000
Timer:ReadAllQuery:org.apache.ambari.server.orm.entities.ClusterConfigMappingEntity:null:RowFetch 6,032,673,000
Timer:ReadAllQuery:org.apache.ambari.server.orm.entities.ClusterConfigMappingEntity:null:SqlGeneration 79,000
Timer:ReadAllQuery:org.apache.ambari.server.orm.entities.ClusterConfigMappingEntity:null:SqlPrepare 232,532,000
Timer:ReadAllQuery:org.apache.ambari.server.orm.entities.ClusterConfigMappingEntity:null:StatementExecute 33,624,662,000
The ClusterConfigMappingEntity:null query is executed more than 10,000 times. Whether or not this number exceeds the size of the stale-config cache, it causes a massive performance delay in the Jetty threads, because the database is being hammered and other PropertyProviders must wait until the queries finish.
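To put the trace in perspective, the following sketch works out the query rate and the average cost per query from the figures above. It assumes the profiler's Timer values are in nanoseconds, which the report does not state; the class and method names are illustrative only.

```java
// Back-of-the-envelope arithmetic on the 3-minute trace.
// ASSUMPTION: Timer values are nanoseconds (not stated in the report).
final class TraceMath {
    // Figures copied from the trace for ClusterConfigMappingEntity:null.
    static final long QUERY_COUNT = 11_716L;          // Counter:ReadAllQuery
    static final long TOTAL_TIMER = 80_520_541_000L;  // Timer:ReadAllQuery
    static final int TRACE_MINUTES = 3;

    // Queries issued per minute over the trace window.
    static double queriesPerMinute() {
        return (double) QUERY_COUNT / TRACE_MINUTES;
    }

    // Cumulative time spent in this query, in seconds (assumes nanoseconds).
    static double totalSeconds() {
        return TOTAL_TIMER / 1e9;
    }

    // Average time per query, in milliseconds (assumes nanoseconds).
    static double avgMillisPerQuery() {
        return TOTAL_TIMER / 1e6 / QUERY_COUNT;
    }

    public static void main(String[] args) {
        System.out.printf("queries/minute: %.0f%n", queriesPerMinute());
        System.out.printf("total seconds:  %.1f%n", totalSeconds());
        System.out.printf("avg ms/query:   %.2f%n", avgMillisPerQuery());
    }
}
```

Under that assumption, the same query runs roughly 3,900 times per minute, and its cumulative time is a large fraction of the 180-second window.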
- Setting the server.cache.isStale.expiration value to 28800 improves the behavior of the system
- Ambari goes from totally unusable to usable
- Startup is still an issue, as the code still has to make tens of thousands of calls, but those flatten out after the cache is populated. So, during startup, Ambari is unresponsive.
- After startup, you can use Ambari to send commands and browse around without delay
- If you change a config, however, the problem returns, as the cache is emptied and we make 10,000 more calls. This causes Ambari to be unresponsive until the cache is repopulated
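The behavior described above suggests loading the cluster's config mappings once and sharing the result across all host-component checks, instead of querying per check. The following is a minimal sketch of that idea, not the actual Ambari fix: ClusterMappingCache, its loader function, and the demo values are all hypothetical names introduced for illustration.

```java
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical per-cluster cache: the expensive lookup (standing in for a
// query such as getClusterConfigMappingEntitiesByCluster) runs once per
// cluster until the entry is invalidated by a config change.
final class ClusterMappingCache<V> {
    private final ConcurrentHashMap<Long, V> byCluster = new ConcurrentHashMap<>();
    private final Function<Long, V> loader;

    ClusterMappingCache(Function<Long, V> loader) {
        this.loader = loader;
    }

    // Returns the cached value, invoking the loader only on a miss.
    V get(long clusterId) {
        return byCluster.computeIfAbsent(clusterId, loader);
    }

    // Called when a config changes; only that cluster's entry is dropped.
    void invalidate(long clusterId) {
        byCluster.remove(clusterId);
    }
}

final class ClusterMappingCacheDemo {
    // Simulates 10,000 staleness checks, a config change, then 10,000 more,
    // and returns how many times the (fake) database was queried.
    static int run() {
        AtomicInteger dbQueries = new AtomicInteger();
        ClusterMappingCache<List<String>> cache = new ClusterMappingCache<>(id -> {
            dbQueries.incrementAndGet();              // stands in for the DB query
            return List.of("core-site", "hdfs-site"); // made-up config types
        });
        for (int i = 0; i < 10_000; i++) cache.get(1L);
        cache.invalidate(1L);
        for (int i = 0; i < 10_000; i++) cache.get(1L);
        return dbQueries.get();
    }

    public static void main(String[] args) {
        System.out.println("database queries issued: " + run());
    }
}
```

With this shape, a config change costs one reload for the affected cluster rather than one query per host-component, which is the repopulation cost the last bullet describes.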
There are a ton of threads stuck at:
"qtp-ambari-client-275" prio=10 tid=0x00007f9de801b800 nid=0x6735 waiting for monitor entry [0x00007f9dd66e3000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at org.apache.ambari.server.controller.internal.AbstractProviderModule.checkInit(AbstractProviderModule.java:805)
    - waiting to lock <0x00007fa0744cc3b0> (a org.apache.ambari.server.controller.internal.DefaultProviderModule)
    at org.apache.ambari.server.controller.internal.AbstractProviderModule.getMetricsServiceType(AbstractProviderModule.java:275)
They're all blocked by qtp-ambari-client-247:
"qtp-ambari-client-247" prio=10 tid=0x00007f9dd8001000 nid=0x5915 runnable [0x00007f9ddd0c2000]
   java.lang.Thread.State: RUNNABLE
    at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:2961)
    at com.mysql.jdbc.MysqlIO.nextRowFast(MysqlIO.java:2159)
    at com.mysql.jdbc.MysqlIO.nextRow(MysqlIO.java:1964)
    at com.mysql.jdbc.MysqlIO.readSingleRowSet(MysqlIO.java:3316)
    at com.mysql.jdbc.MysqlIO.getResultSet(MysqlIO.java:463)
    at com.mysql.jdbc.MysqlIO.readResultsForQueryOrUpdate(MysqlIO.java:3040)
    at com.mysql.jdbc.MysqlIO.readAllResults(MysqlIO.java:2288)
    at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2681)
    at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2551)
    - locked <0x00007fa075265510> (a com.mysql.jdbc.JDBC4Connection)
    at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:1861)
    - locked <0x00007fa075265510> (a com.mysql.jdbc.JDBC4Connection)
    at com.mysql.jdbc.PreparedStatement.executeQuery(PreparedStatement.java:1962)
    - locked <0x00007fa075265510> (a com.mysql.jdbc.JDBC4Connection)
    at com.mchange.v2.c3p0.impl.NewProxyPreparedStatement.executeQuery(NewProxyPreparedStatement.java:353)
    at org.eclipse.persistence.internal.databaseaccess.DatabaseAccessor.executeSelect(DatabaseAccessor.java:1009)
    at org.eclipse.persistence.internal.databaseaccess.DatabaseAccessor.basicExecuteCall(DatabaseAccessor.java:644)
    at org.eclipse.persistence.internal.databaseaccess.DatabaseAccessor.executeCall(DatabaseAccessor.java:560)
    at org.eclipse.persistence.internal.sessions.AbstractSession.basicExecuteCall(AbstractSession.java:2055)
    at org.eclipse.persistence.sessions.server.ServerSession.executeCall(ServerSession.java:570)
    at org.eclipse.persistence.internal.queries.DatasourceCallQueryMechanism.executeCall(DatasourceCallQueryMechanism.java:242)
    at org.eclipse.persistence.internal.queries.DatasourceCallQueryMechanism.executeCall(DatasourceCallQueryMechanism.java:228)
    at org.eclipse.persistence.internal.queries.DatasourceCallQueryMechanism.executeSelectCall(DatasourceCallQueryMechanism.java:299)
    at org.eclipse.persistence.internal.queries.DatasourceCallQueryMechanism.selectAllRows(DatasourceCallQueryMechanism.java:694)
    at org.eclipse.persistence.internal.queries.ExpressionQueryMechanism.selectAllRowsFromTable(ExpressionQueryMechanism.java:2740)
    at org.eclipse.persistence.internal.queries.ExpressionQueryMechanism.selectAllRows(ExpressionQueryMechanism.java:2693)
    at org.eclipse.persistence.queries.ReadAllQuery.executeObjectLevelReadQuery(ReadAllQuery.java:559)
    at org.eclipse.persistence.queries.ObjectLevelReadQuery.executeDatabaseQuery(ObjectLevelReadQuery.java:1175)
    at org.eclipse.persistence.queries.DatabaseQuery.execute(DatabaseQuery.java:904)
    at org.eclipse.persistence.queries.ObjectLevelReadQuery.execute(ObjectLevelReadQuery.java:1134)
    at org.eclipse.persistence.queries.ReadAllQuery.execute(ReadAllQuery.java:460)
    at org.eclipse.persistence.queries.ObjectLevelReadQuery.executeInUnitOfWork(ObjectLevelReadQuery.java:1222)
    at org.eclipse.persistence.internal.sessions.UnitOfWorkImpl.internalExecuteQuery(UnitOfWorkImpl.java:2896)
    at org.eclipse.persistence.internal.sessions.AbstractSession.executeQuery(AbstractSession.java:1857)
    at org.eclipse.persistence.internal.sessions.AbstractSession.executeQuery(AbstractSession.java:1839)
    at org.eclipse.persistence.internal.sessions.AbstractSession.executeQuery(AbstractSession.java:1804)
    at org.eclipse.persistence.internal.jpa.QueryImpl.executeReadQuery(QueryImpl.java:258)
    at org.eclipse.persistence.internal.jpa.QueryImpl.getResultList(QueryImpl.java:473)
    at org.apache.ambari.server.orm.dao.DaoUtils.selectList(DaoUtils.java:62)
    at org.apache.ambari.server.orm.dao.ClusterDAO.getClusterConfigMappingEntitiesByCluster(ClusterDAO.java:240)
Attachments