We're seeing a lot of issues with hadoop-qa related to threads or file descriptors.
Monitoring these counters would ease the analysis.
Note also that:
- if we want to execute the tests in the same JVM (because the test is small or because we want to share the cluster), we can't afford to leak too many resources
- if the tests leak, it's more difficult to detect a leak in the software itself.
I'm attaching the piece of code that I used. It requires two lines in a unit test class to:
- before every test, count the threads and the open file descriptors
- after every test, compare with the previous value.
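To illustrate the before/after check described above, here is a minimal Java sketch. It is not the attached code: the class name ResourceSnapshot and its methods are hypothetical. It assumes a JVM that exposes com.sun.management.UnixOperatingSystemMXBean (HotSpot on Unix-like systems); elsewhere the file descriptor count is reported as -1.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

/** Hypothetical helper: a point-in-time count of threads and open file descriptors. */
class ResourceSnapshot {
    final int threads;
    final long openFds; // -1 when the JVM does not expose the counter

    ResourceSnapshot(int threads, long openFds) {
        this.threads = threads;
        this.openFds = openFds;
    }

    /** Reads the current live thread count and, where available, the open file descriptor count. */
    static ResourceSnapshot capture() {
        int threads = ManagementFactory.getThreadMXBean().getThreadCount();
        long fds = -1;
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        // Assumption: this interface is only implemented on Unix-like platforms.
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            fds = ((com.sun.management.UnixOperatingSystemMXBean) os).getOpenFileDescriptorCount();
        }
        return new ResourceSnapshot(threads, fds);
    }

    /** Formats this snapshot (taken after a test) against the one taken before it. */
    String describeChange(String testName, ResourceSnapshot before) {
        return testName + ": " + threads + " threads (was " + before.threads + "), "
            + openFds + " file descriptors (was " + before.openFds + ").";
    }
}
```

In a JUnit test class, capture() would be called in the setup method and describeChange() logged in the teardown method, which keeps the per-class cost to about two lines.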
I ran it on some tests; we have for example:
- client.TestMultiParallel#testBatchWithManyColsInOneRowGetAndPut: 232 threads (was 231), 390 file descriptors (was 390). => TestMultiParallel uses 232 threads!
- client.TestMultipleTimestamps#testWithColumnDeletes: 152 threads (was 151), 283 file descriptors (was 282).
- client.TestAdmin#testCheckHBaseAvailableClosesConnection: 477 threads (was 294), 815 file descriptors (was 461)
- client.TestMetaMigrationRemovingHTD#testMetaMigration: 149 threads (was 148), 310 file descriptors (was 307).
These are not always leaks; some of the growth can be expected from pooling effects. But still...