Details
Bug
Status: Resolved
Major
Resolution: Fixed
1.15.0
None
Description
The following combination of options was tested on the TPC-H dataset (scale factor 100) on a 4-node setup:
exec.hashjoin.bloom_filter.fpp=0.2
exec.hashjoin.enable.runtime_filter=true
exec.hashjoin.runtime_filter.max.waiting.time=20000
It was observed that the filter eliminates more rows than it should.
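For reproduction, the three options listed above could be applied as in the sketch below. It assumes session scope (`ALTER SESSION`); the report does not say whether the options were set per session or system-wide. The final statement, which disables the runtime filter to obtain a baseline count for comparison, is also an assumption about how the expected values could be checked.

```sql
-- Assumption: session-scoped settings; the report does not state the scope used.
ALTER SESSION SET `exec.hashjoin.enable.runtime_filter` = true;
ALTER SESSION SET `exec.hashjoin.bloom_filter.fpp` = 0.2;
ALTER SESSION SET `exec.hashjoin.runtime_filter.max.waiting.time` = 20000;

-- Assumption: rerun the same queries with the filter off to get a baseline count.
ALTER SESSION SET `exec.hashjoin.enable.runtime_filter` = false;
```

A Bloom filter can produce false positives but not false negatives, so enabling the runtime filter should never lower the row count of the join. A count that differs between the two settings points to a defect rather than to the configured false-positive probability.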
0: jdbc:drill:schema=dfs.par100> select count(*) from (select * from lineitem l, supplier s where l.l_suppkey = s.s_suppkey and s.s_acctbal <1000);
+---------+
| EXPR$0  |
+---------+
| 405566  |
+---------+
1 row selected (10.565 seconds)
0: jdbc:drill:schema=dfs.par100> select count(*) from (select * from lineitem l, supplier s where l.l_suppkey = s.s_suppkey and s.s_acctbal <1000);
+---------+
| EXPR$0  |
+---------+
| 405769  |
+---------+
1 row selected (9.845 seconds)
The expected row count for the above (broadcast join) query is 109307880.
0: jdbc:drill:schema=dfs.par100> select count(*) from (select * from lineitem l, orders o where o.o_orderkey = l.l_orderkey and o.o_totalprice < 100000);
+-----------+
| EXPR$0    |
+-----------+
| 37338355  |
+-----------+
1 row selected (44.698 seconds)
0: jdbc:drill:schema=dfs.par100> select count(*) from (select * from lineitem l, orders o where o.o_orderkey = l.l_orderkey and o.o_totalprice < 100000);
+-----------+
| EXPR$0    |
+-----------+
| 38044874  |
+-----------+
1 row selected (44.871 seconds)
The expected row count for the above (hash-partitioned join) query is 96176495.
Attachments
Issue Links
- requires DRILL-7016 Wrong query result with RuntimeFilter enabled when order of join and filter condition is swapped (Resolved)