Details
-
Bug
-
Status: Open
-
Critical
-
Resolution: Unresolved
-
1.19.0
-
None
-
None
Description
Summary of issue:
- We are trying to inner join 2 MySQL tables.
- persons table: a very small table
- foodmart table: a large table
- The query produce different results if we change the join order in the "from" clause: "from A join B" vs "from B join A". One query produces correct results while the other query produces empty results.
- The physical plan generated for the produce-empty query was not optimal. it tried to scan be big table.
Working query and physical plan:
select "t2"."product_name", "t2"."customer_id" from "foodmart-mysql"."foodmart" as "t2" join "persons"."persons" as "t1" on"t1"."person_id_int"="t2"."customer_id" and"t2"."timestamp" >= '1997-02-12 00:00:00' and "t2"."timestamp" < '1997-02-12 00:01:00'
EnumerableCalc(expr#0..4=[\{inputs}], proj#0..1=[\{exprs}]): rowcount = 375.0, cumulative cost = {1287.971895621705 rows, 3719.5 cpu, 0.0 io}, id = 671 EnumerableJoin(condition=[=($1, $4)], joinType=[inner]): rowcount = 375.0, cumulative cost = {912.971895621705 rows, 1094.5 cpu, 0.0 io}, id = 667 JdbcToEnumerableConverter: rowcount = 25.0, cumulative cost = {147.5 rows, 283.5 cpu, 0.0 io}, id = 660 JdbcProject(product_name=[$5], customer_id=[$22], $f85=[>=($78, 1997-02-12 00:00:00)], $f86=[<($78, 1997-02-12 00:01:00)]): rowcount = 25.0, cumulative cost = {145.0 rows, 281.0 cpu, 0.0 io}, id = 658 JdbcFilter(condition=[AND(>=($78, 1997-02-12 00:00:00), <($78, 1997-02-12 00:01:00))]): rowcount = 25.0, cumulative cost = {125.0 rows, 201.0 cpu, 0.0 io}, id = 656 JdbcTableScan(table=[[foodmart-mysql, foodmart]]): rowcount = 100.0, cumulative cost = {100.0 rows, 101.0 cpu, 0.0 io}, id = 340 EnumerableCalc(expr#0..5=[\{inputs}], person_id_int=[$t5]): rowcount = 100.0, cumulative cost = {210.0 rows, 811.0 cpu, 0.0 io}, id = 673 JdbcToEnumerableConverter: rowcount = 100.0, cumulative cost = {110.0 rows, 111.0 cpu, 0.0 io}, id = 663 JdbcTableScan(table=[[persons, persons]]): rowcount = 100.0, cumulative cost = {100.0 rows, 101.0 cpu, 0.0 io}, id = 341
Produce-empty query and physical plan:
select "t2"."product_name", "t2"."customer_id" from "persons"."persons" as "t1" join "foodmart-mysql"."foodmart" as "t2" on "t1"."person_id_int"="t2"."customer_id" and"t2"."timestamp" >= '1997-02-12 00:00:00' and "t2"."timestamp" < '1997-02-12 00:01:00'
EnumerableCalc(expr#0..4=[\{inputs}], proj#0..1=[\{exprs}]): rowcount = 375.0, cumulative cost = {1255.471895621705 rows, 12427.0 cpu, 0.0 io}, id = 334 EnumerableJoin(condition=[=($1, $4)], joinType=[inner]): rowcount = 375.0, cumulative cost = {880.471895621705 rows, 9802.0 cpu, 0.0 io}, id = 328 EnumerableCalc(expr#0..84=[\{inputs}], expr#85=[1997-02-12 00:00:00], expr#86=[>=($t78, $t85)], expr#87=[1997-02-12 00:01:00], expr#88=[<($t78, $t87)], expr#89=[AND($t86, $t88)], product_name=[$t5], customer_id=[$t22], $f85=[$t86], $f86=[$t88], $condition=[$t89]): rowcount = 25.0, cumulative cost = {135.0 rows, 9611.0 cpu, 0.0 io}, id = 338 JdbcToEnumerableConverter: rowcount = 100.0, cumulative cost = {110.0 rows, 111.0 cpu, 0.0 io}, id = 317 JdbcTableScan(table=[[foodmart-mysql, foodmart]]): rowcount = 100.0, cumulative cost = {100.0 rows, 101.0 cpu, 0.0 io}, id = 1 JdbcToEnumerableConverter: rowcount = 100.0, cumulative cost = {190.0 rows, 191.0 cpu, 0.0 io}, id = 326 JdbcProject(person_id_int=[$5]): rowcount = 100.0, cumulative cost = {180.0 rows, 181.0 cpu, 0.0 io}, id = 324 JdbcTableScan(table=[[persons, persons]]): rowcount = 100.0, cumulative cost = {100.0 rows, 101.0 cpu, 0.0 io}, id = 0