The following query fails on TPC-H SF100 data with the error: UNSUPPORTED_OPERATION ERROR: Hash-Join can not partition the inner data any further (probably due to too many join-key duplicates).
set `exec.hashjoin.enable.runtime_filter` = true;
set `exec.hashjoin.runtime_filter.max.waiting.time` = 10000;
set `planner.enable_broadcast_join` = false;
l1.l_discount IN (
distinct(cast(l2.l_discount as double))
The subquery uses the DISTINCT keyword, so its result should not contain duplicate values.
I suspect the failure is caused by the semi-join, because the query succeeds when semi-join is explicitly disabled.
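For anyone reproducing this, semi-join can be turned off for the session before rerunning the query. The option name below is what I believe the planner option is called (`planner.enable_semijoin`); treat it as an assumption and check it against sys.options on your build.

```sql
-- Assumed option name; verify with: select * from sys.options where name like '%semijoin%';
set `planner.enable_semijoin` = false;
```

With this set and the other three options unchanged, the query is expected to complete, which is what points at the semi-join path.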