Details
-
Bug
-
Status: Closed
-
Major
-
Resolution: Fixed
-
None
Description
We see failures in QE tests with cannot allocate connections errors. The exception stack like following:
2020-09-29T18:44:26,563 INFO [Heartbeater-0]: txn.TxnHandler (TxnHandler.java:checkRetryable(3733)) - Non-retryable error in heartbeat(HeartbeatRequest(lockid:0, txnid:11908)) : Cannot get a connection, general error (SQLState=null, ErrorCode=0) 2020-09-29T18:44:26,564 ERROR [Heartbeater-0]: metastore.RetryingHMSHandler (RetryingHMSHandler.java:invokeInternal(201)) - MetaException(message:Unable to select from transaction database org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, general error at org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:118) at org.apache.hadoop.hive.metastore.txn.TxnHandler.getDbConn(TxnHandler.java:3605) at org.apache.hadoop.hive.metastore.txn.TxnHandler.getDbConn(TxnHandler.java:3598) at org.apache.hadoop.hive.metastore.txn.TxnHandler.heartbeat(TxnHandler.java:2739) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.heartbeat(HiveMetaStore.java:8452) at sun.reflect.GeneratedMethodAccessor415.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108) at com.sun.proxy.$Proxy63.heartbeat(Unknown Source) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.heartbeat(HiveMetaStoreClient.java:3247) at sun.reflect.GeneratedMethodAccessor414.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:213) at com.sun.proxy.$Proxy64.heartbeat(Unknown Source) at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.heartbeat(DbTxnManager.java:671) at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager$Heartbeater.lambda$run$0(DbTxnManager.java:1102) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898) at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager$Heartbeater.run(DbTxnManager.java:1101) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: java.lang.InterruptedException at java.lang.Object.wait(Native Method) at org.apache.commons.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:1112) at org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:106) ... 29 more ) at org.apache.hadoop.hive.metastore.txn.TxnHandler.heartbeat(TxnHandler.java:2747) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.heartbeat(HiveMetaStore.java:8452) at sun.reflect.GeneratedMethodAccessor415.invoke(Unknown Source)
and
Caused by: java.util.NoSuchElementException: Timeout waiting for idle object at org.apache.commons.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:1134) at org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:106) ... 53 more ) at org.apache.hadoop.hive.metastore.txn.TxnHandler.cleanupRecords(TxnHandler.java:3375) at org.apache.hadoop.hive.metastore.AcidEventListener.onDropTable(AcidEventListener.java:65) at org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier$19.notify(MetaStoreListenerNotifier.java:103) at org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.notifyEvent(MetaStoreListenerNotifier.java:285) at org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.notifyEvent(MetaStoreListenerNotifier.java:347) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_table_core(HiveMetaStore.java:2986) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_table_with_environment_context(HiveMetaStore.java:3240) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_table(HiveMetaStore.java:3227) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_database_core(HiveMetaStore.java:1879) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_database(HiveMetaStore.java:1978) at sun.reflect.GeneratedMethodAccessor369.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108) at com.sun.proxy.$Proxy63.drop_database(Unknown Source) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropDatabaseCascadePerDb(HiveMetaStoreClient.java:1393) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropDatabase(HiveMetaStoreClient.java:1324) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropDatabase(HiveMetaStoreClient.java:1277) at sun.reflect.GeneratedMethodAccessor368.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:213) at com.sun.proxy.$Proxy64.dropDatabase(Unknown Source) at org.apache.hadoop.hive.ql.metadata.Hive.dropDatabase(Hive.java:618) at org.apache.hadoop.hive.ql.ddl.database.drop.DropDatabaseOperation.execute(DropDatabaseOperation.java:50) at org.apache.hadoop.hive.ql.ddl.DDLTask.execute(DDLTask.java:80) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105) at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:357) at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:330) at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:246) at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:109) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:740) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:495) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:489) at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:166) at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:225)
It is not easy to reproduce, so there are possible connection leaks under some error conditions. This jira is to fix code that may cause connection leaks in TxnHandler.
Attachments
Issue Links
- is duplicated by
-
HIVE-25483 TxnHandler::acquireLock should close the DB conn to avoid connection leaks
- Resolved
- links to