The logic for disabling runtime filters based on stats is faulty. The issue is that runtime filters are evaluated even before they arrive. This evaluation always returns true, so 'considered' is incremented but 'rejected' is not. This in turn leads the filter-disabling logic to conclude that the filter is ineffective. However, there is no way to know whether a filter is ineffective before it arrives.
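To make the failure mode concrete, here is a minimal Python sketch of the behavior described above. It is not the actual backend code: the class and method names, the `min_rows` and `min_reject_ratio` thresholds, and the `count_before_arrival` switch are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class FilterStats:
    considered: int = 0
    rejected: int = 0


class RuntimeFilter:
    """Hypothetical model of a runtime filter with effectiveness stats."""

    def __init__(self) -> None:
        self.values: Optional[Set[int]] = None  # None until the filter arrives
        self.stats = FilterStats()
        self.enabled = True

    def arrive(self, values: Set[int]) -> None:
        self.values = values

    def eval(self, key: int, count_before_arrival: bool) -> bool:
        if self.values is None:
            # Filter has not arrived: every row passes.
            # The faulty behavior still counts the row as 'considered'.
            if count_before_arrival:
                self.stats.considered += 1
            return True
        self.stats.considered += 1
        if key in self.values:
            return True
        self.stats.rejected += 1
        return False

    def maybe_disable(self, min_rows: int = 100,
                      min_reject_ratio: float = 0.1) -> None:
        # Assumed rule: disable once enough rows were considered and too
        # few of them were rejected. The thresholds are made up.
        s = self.stats
        if s.considered >= min_rows and s.rejected / s.considered < min_reject_ratio:
            self.enabled = False


def simulate(count_before_arrival: bool) -> bool:
    """Scan 1000 rows before the filter arrives; return whether it stays enabled."""
    f = RuntimeFilter()
    for key in range(1000):
        f.eval(key, count_before_arrival)
    f.maybe_disable()
    f.arrive({1, 2, 3})
    return f.enabled
```

Under these assumptions, `simulate(True)` returns `False`: the filter is disabled with 1000 rows considered and none rejected, before it has arrived. `simulate(False)` returns `True`, because rows seen before arrival are not counted and the disabling rule has nothing to act on.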
I was able to reproduce this easily on TPC-H Q2 with scale factor 20 and runtime_filter_arrival_wait_time_ms=1. I added logging to prove that the filters were being disabled before they arrived: