Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-24236

Connection leak in TxnHandler

    XMLWordPrintableJSON

Details

    Description

      We see failures in QE tests with cannot allocate connections errors. The exception stack like following:

      2020-09-29T18:44:26,563 INFO  [Heartbeater-0]: txn.TxnHandler (TxnHandler.java:checkRetryable(3733)) - Non-retryable error in heartbeat(HeartbeatRequest(lockid:0, txnid:11908)) : Cannot get a connection, general error (SQLState=null, ErrorCode=0)
      2020-09-29T18:44:26,564 ERROR [Heartbeater-0]: metastore.RetryingHMSHandler (RetryingHMSHandler.java:invokeInternal(201)) - MetaException(message:Unable to select from transaction database org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, general error
              at org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:118)
              at org.apache.hadoop.hive.metastore.txn.TxnHandler.getDbConn(TxnHandler.java:3605)
              at org.apache.hadoop.hive.metastore.txn.TxnHandler.getDbConn(TxnHandler.java:3598)
              at org.apache.hadoop.hive.metastore.txn.TxnHandler.heartbeat(TxnHandler.java:2739)
              at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.heartbeat(HiveMetaStore.java:8452)
              at sun.reflect.GeneratedMethodAccessor415.invoke(Unknown Source)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:498)
              at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
              at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
              at com.sun.proxy.$Proxy63.heartbeat(Unknown Source)
              at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.heartbeat(HiveMetaStoreClient.java:3247)
              at sun.reflect.GeneratedMethodAccessor414.invoke(Unknown Source)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:498)
              at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:213)
              at com.sun.proxy.$Proxy64.heartbeat(Unknown Source)
              at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.heartbeat(DbTxnManager.java:671)
              at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager$Heartbeater.lambda$run$0(DbTxnManager.java:1102)
              at java.security.AccessController.doPrivileged(Native Method)
              at javax.security.auth.Subject.doAs(Subject.java:422)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898)
              at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager$Heartbeater.run(DbTxnManager.java:1101)
              at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
              at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
              at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
              at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
              at java.lang.Thread.run(Thread.java:748)
      Caused by: java.lang.InterruptedException
              at java.lang.Object.wait(Native Method)
              at org.apache.commons.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:1112)
              at org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:106)
              ... 29 more
      )
              at org.apache.hadoop.hive.metastore.txn.TxnHandler.heartbeat(TxnHandler.java:2747)
              at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.heartbeat(HiveMetaStore.java:8452)
              at sun.reflect.GeneratedMethodAccessor415.invoke(Unknown Source)
      

      and

      Caused by: java.util.NoSuchElementException: Timeout waiting for idle object
              at org.apache.commons.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:1134)
              at org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:106)
              ... 53 more
      )
              at org.apache.hadoop.hive.metastore.txn.TxnHandler.cleanupRecords(TxnHandler.java:3375)
              at org.apache.hadoop.hive.metastore.AcidEventListener.onDropTable(AcidEventListener.java:65)
              at org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier$19.notify(MetaStoreListenerNotifier.java:103)
              at org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.notifyEvent(MetaStoreListenerNotifier.java:285)
              at org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.notifyEvent(MetaStoreListenerNotifier.java:347)
              at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_table_core(HiveMetaStore.java:2986)
              at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_table_with_environment_context(HiveMetaStore.java:3240)
              at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_table(HiveMetaStore.java:3227)
              at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_database_core(HiveMetaStore.java:1879)
              at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_database(HiveMetaStore.java:1978)
              at sun.reflect.GeneratedMethodAccessor369.invoke(Unknown Source)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:498)
              at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
              at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
              at com.sun.proxy.$Proxy63.drop_database(Unknown Source)
              at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropDatabaseCascadePerDb(HiveMetaStoreClient.java:1393)
              at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropDatabase(HiveMetaStoreClient.java:1324)
              at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropDatabase(HiveMetaStoreClient.java:1277)
              at sun.reflect.GeneratedMethodAccessor368.invoke(Unknown Source)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:498)
              at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:213)
              at com.sun.proxy.$Proxy64.dropDatabase(Unknown Source)
              at org.apache.hadoop.hive.ql.metadata.Hive.dropDatabase(Hive.java:618)
              at org.apache.hadoop.hive.ql.ddl.database.drop.DropDatabaseOperation.execute(DropDatabaseOperation.java:50)
              at org.apache.hadoop.hive.ql.ddl.DDLTask.execute(DDLTask.java:80)
              at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213)
              at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105)
              at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:357)
              at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:330)
              at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:246)
              at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:109)
              at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:740)
              at org.apache.hadoop.hive.ql.Driver.run(Driver.java:495)
              at org.apache.hadoop.hive.ql.Driver.run(Driver.java:489)
              at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:166)
              at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:225)
      

      It is not easy to reproduce, so there are possible connection leaks under some error conditions. This jira is to fix code that may cause connection leaks in TxnHandler.

      Attachments

        Issue Links

          Activity

            People

              ychena Yongzhi Chen
              ychena Yongzhi Chen
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated:
                  Original Estimate - Not Specified
                  Not Specified
                  Remaining:
                  Remaining Estimate - 0h
                  0h
                  Logged:
                  Time Spent - 1.5h
                  1.5h