This bug applies to multithreaded HDFS and Kudu scans.
We reserve an optional token for the first scanner thread, but the reservation is not tied to the scan node: any other operator in the same fragment can take that token. In one fragment of TPC-DS q18a the sequence is:
1. The hash join grabs an extra token for the join build. Presumably it does this early so that it gets an optional token before other fragments can grab them.
2. The scan node reserves an optional token in Open(). This optional token is already in use by the hash join.
3. The scan node tries to start the first scanner thread, but there are no optional tokens available, so it can't start any.
4. Eventually the optional token is given up and the scanner thread can start.
If step 4 always happens without requiring the scan to make progress, no deadlock is possible. But if giving up the token depends, directly or indirectly, on the scan making progress, that is a circular dependency and the fragment can deadlock.
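
The four steps can be replayed against a toy model of the token pool. This is only a sketch of the race as described above, not Impala's actual thread resource code: `ToyTokenPool`, its method names, and the quota of 1 are all hypothetical, and the only behavior it assumes is that a reservation raises the pool's limit without tying a token to the caller.

```python
class ToyTokenPool:
    """Hypothetical, simplified per-fragment pool of optional thread tokens."""

    def __init__(self, quota):
        self.quota = quota      # optional tokens the fragment may use
        self.reserved = 0       # floor on the limit; NOT tied to any operator
        self.in_use = 0         # optional tokens currently held

    def reserve_optional(self, n):
        # A reservation only guarantees the pool may hand out n tokens.
        # It does not record which operator the tokens are meant for.
        self.reserved = max(self.reserved, n)

    def try_acquire_optional(self):
        limit = max(self.quota, self.reserved)
        if self.in_use + 1 > limit:
            return False
        self.in_use += 1
        return True

    def release_optional(self):
        assert self.in_use > 0, "no optional token to release"
        self.in_use -= 1


def replay_sequence():
    """Replay steps 1-4 and record whether each acquire succeeded."""
    pool = ToyTokenPool(quota=1)  # assumed: one optional token available
    events = []
    # Step 1: the hash join grabs the optional token for the join build.
    events.append(("join_acquire", pool.try_acquire_optional()))
    # Step 2: the scan node reserves a token in Open(); the token it is
    # counting on is already held by the join.
    pool.reserve_optional(1)
    # Step 3: the scan node cannot start its first scanner thread.
    events.append(("scan_acquire", pool.try_acquire_optional()))
    # Step 4: the join gives up the token and the scanner thread can start.
    pool.release_optional()
    events.append(("scan_retry", pool.try_acquire_optional()))
    return events
```

In this model `replay_sequence()` returns `[("join_acquire", True), ("scan_acquire", False), ("scan_retry", True)]`. The scan makes progress only because the join releases its token in step 4. If that release were itself waiting on the scan, the retry would never happen.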
Kudu scans also do not implement the num_scanner_threads query option in the same way as HDFS scans: the IMPALA-2831 changes were not applied to Kudu.