Currently, the developer is responsible for explicitly closing connections once they are no longer needed. Furthermore, because connections are managed based on the identity of the Configuration object, one is forced to clone the configuration object in order to clean up the connection safely (for a case in point, see HTablePool's constructor). This issue is already documented in the HConnectionManager class:
But sharing connections makes clean up of HConnection instances a little awkward. Currently, clients cleanup by calling #deleteConnection(Configuration, boolean). This will shutdown the zookeeper connection the HConnection was using and clean up all HConnection resources as well as stopping proxies to servers out on the cluster. Not running the cleanup will not end the world; it'll just stall the closeup some and spew some zookeeper connection failed messages into the log. Running the cleanup on a HConnection that is subsequently used by another will cause breakage so be careful running cleanup. To create a HConnection that is not shared by others, you can create a new Configuration instance, pass this new instance to #getConnection(Configuration), and then when done, close it up by doing something like the following:
Here, we propose a reference-counting mechanism for managing connections that allows HTables to clean up after themselves. In particular, we extend the HConnectionInterface interface to support reference counting, where a reference typically indicates that the connection is in use by an HTable, although there could be other sources.
To elaborate, when an HTable is constructed, it increments the reference count on the connection given to it. Similarly, when it is closed, that reference count is decremented. When no references to the connection remain, HTable#close deletes the connection itself, sparing the developer from doing so.
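The life cycle described above can be illustrated with a minimal, self-contained sketch. It does not use the HBase API: SharedConnection, Table, acquire, release, references and isClosed are hypothetical names chosen for illustration, standing in for the connection and for HTable. As a simplification, the "last reference" check lives in release() rather than in the table's close method, so that the decrement and the shutdown happen under a single lock.

```java
class RefCountSketch {

    /** Hypothetical stand-in for a shared connection (not the real HBase interface). */
    static final class SharedConnection {
        private int refCount = 0;
        private boolean closed = false;

        /** Registers one more user of this connection. */
        synchronized void acquire() {
            if (closed) {
                throw new IllegalStateException("connection already closed");
            }
            refCount++;
        }

        /** Drops one reference; the connection is shut down when none remain. */
        synchronized void release() {
            if (refCount == 0) {
                throw new IllegalStateException("release without matching acquire");
            }
            refCount--;
            if (refCount == 0) {
                closed = true; // stands in for deleting the connection
            }
        }

        synchronized int references() { return refCount; }

        synchronized boolean isClosed() { return closed; }
    }

    /** Hypothetical stand-in for a table that holds a reference to a connection. */
    static final class Table implements AutoCloseable {
        private final SharedConnection connection;
        private boolean closed = false;

        Table(SharedConnection connection) {
            this.connection = connection;
            connection.acquire(); // construction increments the count
        }

        @Override
        public synchronized void close() {
            if (closed) {
                return; // closing twice must not release twice
            }
            closed = true;
            connection.release(); // closing decrements the count
        }
    }

    public static void main(String[] args) {
        SharedConnection conn = new SharedConnection();
        Table first = new Table(conn);
        Table second = new Table(conn);

        first.close();
        System.out.println("closed after first table: " + conn.isClosed());

        second.close();
        System.out.println("closed after last table: " + conn.isClosed());
    }
}
```

In this sketch the caller never closes the connection directly; closing the last table that uses it is enough. A real implementation would also need to decide how other holders of a reference (the "other sources" mentioned above) participate in the count.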