CPU-intensive threads are supposed to be allocated through the ThreadResourceMgr, which keeps track of the total number of threads that may be allocated across all queries on the host. However, when RM is enabled, those threads also need to be reported to the QueryResourceMgr so that additional VCPUs can be requested from Yarn.
The blocking join operator currently attempts to get a thread from the ThreadResourceMgr, but when it obtains one, it doesn't report that thread to the QueryResourceMgr. As a result, Impala uses more VCPUs than Yarn has allocated to it. Unfortunately, the fix is more involved than simply reporting the thread to the QueryResourceMgr, because doing so may starve a child scan node of the thread it is supposed to have, causing the query to hang when the scan node can't start any scanner threads. A quick fix may involve changing the way the scan node handles its first scanner thread, but we should consider rethinking the CPU thread management duality presented by the ThreadResourceMgr and the QueryResourceMgr.
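To make the two failure modes concrete, here is a minimal toy model in Python. It is only a sketch under stated assumptions and does not reproduce Impala's real classes: `ToyThreadResourceMgr`, `ToyQueryResourceMgr`, `run_scenario`, and all method names are hypothetical, and the rule that a scan node starts a scanner thread only while the reported thread count is below the VCPU grant is an assumption made for illustration.

```python
class ToyThreadResourceMgr:
    """Hypothetical stand-in for the host-wide thread pool (not Impala's API)."""

    def __init__(self, host_quota):
        self.host_quota = host_quota
        self.in_use = 0

    def try_acquire(self):
        # Hand out a thread only while the host-wide quota is not exhausted.
        if self.in_use >= self.host_quota:
            return False
        self.in_use += 1
        return True


class ToyQueryResourceMgr:
    """Hypothetical stand-in for per-query VCPU accounting (not Impala's API)."""

    def __init__(self, granted_vcpus):
        self.granted_vcpus = granted_vcpus
        self.reported = 0

    def report_thread(self):
        self.reported += 1

    def has_headroom(self):
        # Assumption: a new thread is allowed only below the current grant.
        return self.reported < self.granted_vcpus


def run_scenario(join_reports_thread, granted_vcpus=1, host_quota=4):
    trm = ToyThreadResourceMgr(host_quota)
    qrm = ToyQueryResourceMgr(granted_vcpus)
    running = 0

    # The blocking join takes a thread from the host-wide pool.
    if trm.try_acquire():
        running += 1
        if join_reports_thread:
            qrm.report_thread()

    # The child scan node then tries to start its first scanner thread.
    scan_started = False
    if qrm.has_headroom() and trm.try_acquire():
        running += 1
        qrm.report_thread()
        scan_started = True

    return {
        "running": running,
        "reported": qrm.reported,
        "granted": granted_vcpus,
        "scan_started": scan_started,
    }


if __name__ == "__main__":
    # Current behaviour: the join thread is never reported.
    print(run_scenario(join_reports_thread=False))
    # Naive fix: the join thread is reported and uses up the whole grant.
    print(run_scenario(join_reports_thread=True))
```

In this model, with a grant of one VCPU, the unreported join thread lets two threads run while only one is accounted for. Reporting the join thread removes the overuse but leaves no headroom for the scan node, so its first scanner thread never starts. That is the trade-off that makes the simple fix insufficient.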