Details
- Bug
- Status: Closed
- Major
- Resolution: Fixed
- None
- None
Description
Commit 20210de, which was part of HIVE-15546, introduced a thread pool that is not shut down once its tasks have completed. This leaks threads for every query that uses more than one partition, and the leaked threads are never removed automatically. As queries spanning multiple partitions are run, the number of threads keeps increasing and is never reduced. On my machine, hiveserver2 gets slower and slower once 10k threads are reached.
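The leak described above can be reproduced outside Hive with a few lines of plain Java. This is only a minimal sketch, not Hive's code: the class name `ThreadLeakSketch`, the `Leak-Sketch-` thread name prefix and the `runQuery` helper are hypothetical, and daemon threads are used purely so that the demo JVM can still exit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

public class ThreadLeakSketch {
    // Hypothetical name prefix, used only to find the demo's worker threads.
    static final String PREFIX = "Leak-Sketch-";
    private static final AtomicInteger IDS = new AtomicInteger();

    // Daemon threads, so the leaked workers do not keep this demo JVM alive.
    private static final ThreadFactory FACTORY = r -> {
        Thread t = new Thread(r, PREFIX + IDS.getAndIncrement());
        t.setDaemon(true);
        return t;
    };

    // Simulates one query: one task per partition, and the pool is never shut down.
    static int runQuery(int partitions, int maxThreads) {
        ExecutorService pool = Executors.newFixedThreadPool(maxThreads, FACTORY);
        List<Future<Integer>> futures = new ArrayList<>();
        for (int i = 0; i < partitions; i++) {
            final int partition = i;
            futures.add(pool.submit(() -> partition));
        }
        int listed = 0;
        try {
            for (Future<Integer> f : futures) {
                f.get();   // wait for the simulated listing to finish
                listed++;
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return listed;     // no pool.shutdown(): the idle workers stay alive
    }

    // Counts live threads created by the factory above.
    static long liveWorkers() {
        return Thread.getAllStackTraces().keySet().stream()
                .filter(t -> t.getName().startsWith(PREFIX))
                .count();
    }

    public static void main(String[] args) {
        for (int query = 1; query <= 3; query++) {
            runQuery(8, 4);
            System.out.println("after query " + query + ": " + liveWorkers() + " idle workers");
        }
    }
}
```

Each simulated query leaves its idle workers behind, so the count grows with every call, which matches the behaviour reported for hiveserver2.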
Thread pools only shut down automatically under special circumstances (see the Finalization section of the ThreadPoolExecutor documentation). That is not currently the case for the Get-Input-Paths thread pool. I would add a pool.shutdown() call in a finally block just before returning the result, to make sure the threads are really shut down.
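The proposed fix could look roughly like the following. This is a sketch under assumptions rather than the actual Hive patch: `ShutdownSketch`, `runAndShutdown`, `listPartitions` and the `/warehouse/` path are hypothetical names, and only the try/finally pattern around `pool.shutdown()` is taken from the suggestion above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ShutdownSketch {
    // Runs all tasks on the pool and always shuts the pool down afterwards,
    // whether the tasks succeed or one of them fails.
    static <T> List<T> runAndShutdown(ExecutorService pool, List<Callable<T>> tasks) {
        try {
            List<Future<T>> futures = new ArrayList<>();
            for (Callable<T> task : tasks) {
                futures.add(pool.submit(task));
            }
            List<T> results = new ArrayList<>();
            for (Future<T> f : futures) {
                results.add(f.get());   // preserves submission order
            }
            return results;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();            // lets idle workers exit instead of leaking
        }
    }

    // Hypothetical stand-in for listing one path per partition.
    static List<String> listPartitions(List<String> partitions, int maxThreads) {
        List<Callable<String>> tasks = new ArrayList<>();
        for (String p : partitions) {
            tasks.add(() -> "/warehouse/" + p);   // made-up path, for illustration only
        }
        if (maxThreads <= 1) {
            // Assumption: with a single thread no pool is created at all.
            List<String> results = new ArrayList<>();
            try {
                for (Callable<String> task : tasks) {
                    results.add(task.call());
                }
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
            return results;
        }
        return runAndShutdown(Executors.newFixedThreadPool(maxThreads), tasks);
    }

    public static void main(String[] args) {
        List<String> parts = new ArrayList<>();
        parts.add("ds=1");
        parts.add("ds=2");
        System.out.println(listPartitions(parts, 4));
    }
}
```

`shutdown()` does not interrupt running tasks; it only stops the pool from accepting new work and lets the workers terminate once the queue is drained, so it is safe to call after all futures have been collected.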
My current workaround is to set hive.exec.input.listing.max.threads = 1, which prevents the thread pool from being spawned [1] [2].
The same issue probably also applies to the Get-Input-Summary thread pool.
Attachments
Issue Links
- is broken by: HIVE-15546 Optimize Utilities.getInputPaths() so each listStatus of a partition is done in parallel (Resolved)