Uploaded image for project: 'Hive'
  1. Hive
  2. HIVE-24236

Connection leak in TxnHandler

    XMLWordPrintableJSON

    Details

      Description

      We see failures in QE tests with cannot allocate connections errors. The exception stack like following:

      2020-09-29T18:44:26,563 INFO  [Heartbeater-0]: txn.TxnHandler (TxnHandler.java:checkRetryable(3733)) - Non-retryable error in heartbeat(HeartbeatRequest(lockid:0, txnid:11908)) : Cannot get a connection, general error (SQLState=null, ErrorCode=0)
      2020-09-29T18:44:26,564 ERROR [Heartbeater-0]: metastore.RetryingHMSHandler (RetryingHMSHandler.java:invokeInternal(201)) - MetaException(message:Unable to select from transaction database org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, general error
              at org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:118)
              at org.apache.hadoop.hive.metastore.txn.TxnHandler.getDbConn(TxnHandler.java:3605)
              at org.apache.hadoop.hive.metastore.txn.TxnHandler.getDbConn(TxnHandler.java:3598)
              at org.apache.hadoop.hive.metastore.txn.TxnHandler.heartbeat(TxnHandler.java:2739)
              at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.heartbeat(HiveMetaStore.java:8452)
              at sun.reflect.GeneratedMethodAccessor415.invoke(Unknown Source)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:498)
              at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
              at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
              at com.sun.proxy.$Proxy63.heartbeat(Unknown Source)
              at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.heartbeat(HiveMetaStoreClient.java:3247)
              at sun.reflect.GeneratedMethodAccessor414.invoke(Unknown Source)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:498)
              at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:213)
              at com.sun.proxy.$Proxy64.heartbeat(Unknown Source)
              at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.heartbeat(DbTxnManager.java:671)
              at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager$Heartbeater.lambda$run$0(DbTxnManager.java:1102)
              at java.security.AccessController.doPrivileged(Native Method)
              at javax.security.auth.Subject.doAs(Subject.java:422)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898)
              at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager$Heartbeater.run(DbTxnManager.java:1101)
              at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
              at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
              at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
              at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
              at java.lang.Thread.run(Thread.java:748)
      Caused by: java.lang.InterruptedException
              at java.lang.Object.wait(Native Method)
              at org.apache.commons.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:1112)
              at org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:106)
              ... 29 more
      )
              at org.apache.hadoop.hive.metastore.txn.TxnHandler.heartbeat(TxnHandler.java:2747)
              at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.heartbeat(HiveMetaStore.java:8452)
              at sun.reflect.GeneratedMethodAccessor415.invoke(Unknown Source)
      

      and

      Caused by: java.util.NoSuchElementException: Timeout waiting for idle object
              at org.apache.commons.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:1134)
              at org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:106)
              ... 53 more
      )
              at org.apache.hadoop.hive.metastore.txn.TxnHandler.cleanupRecords(TxnHandler.java:3375)
              at org.apache.hadoop.hive.metastore.AcidEventListener.onDropTable(AcidEventListener.java:65)
              at org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier$19.notify(MetaStoreListenerNotifier.java:103)
              at org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.notifyEvent(MetaStoreListenerNotifier.java:285)
              at org.apache.hadoop.hive.metastore.MetaStoreListenerNotifier.notifyEvent(MetaStoreListenerNotifier.java:347)
              at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_table_core(HiveMetaStore.java:2986)
              at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_table_with_environment_context(HiveMetaStore.java:3240)
              at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_table(HiveMetaStore.java:3227)
              at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_database_core(HiveMetaStore.java:1879)
              at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.drop_database(HiveMetaStore.java:1978)
              at sun.reflect.GeneratedMethodAccessor369.invoke(Unknown Source)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:498)
              at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
              at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
              at com.sun.proxy.$Proxy63.drop_database(Unknown Source)
              at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropDatabaseCascadePerDb(HiveMetaStoreClient.java:1393)
              at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropDatabase(HiveMetaStoreClient.java:1324)
              at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropDatabase(HiveMetaStoreClient.java:1277)
              at sun.reflect.GeneratedMethodAccessor368.invoke(Unknown Source)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:498)
              at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:213)
              at com.sun.proxy.$Proxy64.dropDatabase(Unknown Source)
              at org.apache.hadoop.hive.ql.metadata.Hive.dropDatabase(Hive.java:618)
              at org.apache.hadoop.hive.ql.ddl.database.drop.DropDatabaseOperation.execute(DropDatabaseOperation.java:50)
              at org.apache.hadoop.hive.ql.ddl.DDLTask.execute(DDLTask.java:80)
              at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213)
              at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105)
              at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:357)
              at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:330)
              at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:246)
              at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:109)
              at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:740)
              at org.apache.hadoop.hive.ql.Driver.run(Driver.java:495)
              at org.apache.hadoop.hive.ql.Driver.run(Driver.java:489)
              at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:166)
              at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:225)
      

      It is not easy to reproduce, so there are possible connection leaks under some error conditions. This jira is to fix code that may cause connection leaks in TxnHandler.

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                ychena Yongzhi Chen
                Reporter:
                ychena Yongzhi Chen
              • Votes:
                0 Vote for this issue
                Watchers:
                3 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved:

                  Time Tracking

                  Estimated:
                  Original Estimate - Not Specified
                  Not Specified
                  Remaining:
                  Remaining Estimate - 0h
                  0h
                  Logged:
                  Time Spent - 1.5h
                  1.5h