Details
- Bug
- Status: Resolved
- Major
- Resolution: Fixed
- 1.16.0
- None
Description
TPC-H query 17 at scale factor 1000 runs 45% slower with the Statistics code. One issue is that the join order has changed: the build side and the probe side of the hash join in Major Fragment 01 are flipped.
Here is the query:
select
  sum(l.l_extendedprice) / 7.0 as avg_yearly
from
  lineitem l,
  part p
where
  p.p_partkey = l.l_partkey
  and p.p_brand = 'Brand#13'
  and p.p_container = 'JUMBO CAN'
  and l.l_quantity < (
    select
      0.2 * avg(l2.l_quantity)
    from
      lineitem l2
    where
      l2.l_partkey = p.p_partkey
  );
Here is the original plan:
00-00 Screen : rowType = RecordType(ANY avg_yearly): rowcount = 1.0, cumulative cost = \{7.853786601428E10 rows, 6.6179786770537E11 cpu, 3.0599948545E10 io, 1.083019457355776E14 network, 1.17294998955024E11 memory}, id = 489493 00-01 Project(avg_yearly=[/($0, 7.0)]) : rowType = RecordType(ANY avg_yearly): rowcount = 1.0, cumulative cost = \{7.853786601418E10 rows, 6.6179786770527E11 cpu, 3.0599948545E10 io, 1.083019457355776E14 network, 1.17294998955024E11 memory}, id = 489492 00-02 StreamAgg(group=[{}], agg#0=[SUM($0)]) : rowType = RecordType(ANY $f0): rowcount = 1.0, cumulative cost = \{7.853786601318E10 rows, 6.6179786770127E11 cpu, 3.0599948545E10 io, 1.083019457355776E14 network, 1.17294998955024E11 memory}, id = 489491 00-03 UnionExchange : rowType = RecordType(ANY $f0): rowcount = 1.0, cumulative cost = \{7.853786601218E10 rows, 6.6179786768927E11 cpu, 3.0599948545E10 io, 1.083019457355776E14 network, 1.17294998955024E11 memory}, id = 489490 01-01 StreamAgg(group=[{}], agg#0=[SUM($0)]) : rowType = RecordType(ANY $f0): rowcount = 1.0, cumulative cost = \{7.853786601118E10 rows, 6.6179786768127E11 cpu, 3.0599948545E10 io, 1.083019457314816E14 network, 1.17294998955024E11 memory}, id = 489489 01-02 Project(l_extendedprice=[$1]) : rowType = RecordType(ANY l_extendedprice): rowcount = 2.9999948545E9, cumulative cost = \{7.553787115668E10 rows, 6.2579792942727E11 cpu, 3.0599948545E10 io, 1.083019457314816E14 network, 1.17294998955024E11 memory}, id = 489488 01-03 SelectionVectorRemover : rowType = RecordType(ANY l_quantity, ANY l_extendedprice, ANY p_partkey, ANY l_partkey, ANY $f1): rowcount = 2.9999948545E9, cumulative cost = \{7.253787630218E10 rows, 6.2279793457277E11 cpu, 3.0599948545E10 io, 1.083019457314816E14 network, 1.17294998955024E11 memory}, id = 489487 01-04 Filter(condition=[<($0, *(0.2, $4))]) : rowType = RecordType(ANY l_quantity, ANY l_extendedprice, ANY p_partkey, ANY l_partkey, ANY $f1): rowcount = 2.9999948545E9, cumulative cost = 
\{6.953788144768E10 rows, 6.1979793971827E11 cpu, 3.0599948545E10 io, 1.083019457314816E14 network, 1.17294998955024E11 memory}, id = 489486 01-05 HashJoin(condition=[=($2, $3)], joinType=[inner], semi-join: =[false]) : rowType = RecordType(ANY l_quantity, ANY l_extendedprice, ANY p_partkey, ANY l_partkey, ANY $f1): rowcount = 5.999989709E9, cumulative cost = \{6.353789173867999E10 rows, 5.8379800146427E11 cpu, 3.0599948545E10 io, 1.083019457314816E14 network, 1.17294998955024E11 memory}, id = 489485 01-07 Project(l_quantity=[$0], l_extendedprice=[$1], p_partkey=[$2]) : rowType = RecordType(ANY l_quantity, ANY l_extendedprice, ANY p_partkey): rowcount = 5.999989709E9, cumulative cost = \{4.2417927963E10 rows, 2.71618536905E11 cpu, 1.8599969127E10 io, 9.8471562592256E13 network, 7.92E7 memory}, id = 489476 01-09 HashToRandomExchange(dist0=[[$2]]) : rowType = RecordType(ANY l_quantity, ANY l_extendedprice, ANY p_partkey, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 5.999989709E9, cumulative cost = \{3.6417938254E10 rows, 2.53618567778E11 cpu, 1.8599969127E10 io, 9.8471562592256E13 network, 7.92E7 memory}, id = 489475 02-01 UnorderedMuxExchange : rowType = RecordType(ANY l_quantity, ANY l_extendedprice, ANY p_partkey, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 5.999989709E9, cumulative cost = \{3.0417948545E10 rows, 1.57618732434E11 cpu, 1.8599969127E10 io, 1.677312E11 network, 7.92E7 memory}, id = 489474 04-01 Project(l_quantity=[$0], l_extendedprice=[$1], p_partkey=[$2], E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($2, 1301011)]) : rowType = RecordType(ANY l_quantity, ANY l_extendedprice, ANY p_partkey, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 5.999989709E9, cumulative cost = \{2.4417958836E10 rows, 1.51618742725E11 cpu, 1.8599969127E10 io, 1.677312E11 network, 7.92E7 memory}, id = 489473 04-02 Project(l_quantity=[$1], l_extendedprice=[$2], p_partkey=[$3]) : rowType = RecordType(ANY l_quantity, ANY l_extendedprice, ANY p_partkey): rowcount = 5.999989709E9, cumulative 
cost = \{1.8417969127E10 rows, 1.09618814762E11 cpu, 1.8599969127E10 io, 1.677312E11 network, 7.92E7 memory}, id = 489472 04-03 HashJoin(condition=[=($0, $3)], joinType=[inner], semi-join: =[false]) : rowType = RecordType(ANY l_partkey, ANY l_quantity, ANY l_extendedprice, ANY p_partkey): rowcount = 5.999989709E9, cumulative cost = \{1.2417979418E10 rows, 9.1618845635E10 cpu, 1.8599969127E10 io, 1.677312E11 network, 7.92E7 memory}, id = 489471 04-05 Scan(table=[[dfs, tpchpar1000_micro, lineitem]], groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///tpchParquet10/SF1000/lineitem]], selectionRoot=maprfs:/tpchParquet10/SF1000/lineitem, numFiles=1, numRowGroups=3250, usedMetadataFile=false, columns=[`l_partkey`, `l_quantity`, `l_extendedprice`]]]) : rowType = RecordType(ANY l_partkey, ANY l_quantity, ANY l_extendedprice): rowcount = 5.999989709E9, cumulative cost = \{5.999989709E9 rows, 1.7999969127E10 cpu, 1.7999969127E10 io, 0.0 network, 0.0 memory}, id = 489465 04-04 BroadcastExchange : rowType = RecordType(ANY p_partkey): rowcount = 4500000.0, cumulative cost = \{4.135E8 rows, 1.583E9 cpu, 6.0E8 io, 1.677312E11 network, 0.0 memory}, id = 489470 06-01 Project(p_partkey=[$0]) : rowType = RecordType(ANY p_partkey): rowcount = 4500000.0, cumulative cost = \{4.09E8 rows, 1.547E9 cpu, 6.0E8 io, 0.0 network, 0.0 memory}, id = 489469 06-02 SelectionVectorRemover : rowType = RecordType(ANY p_partkey, ANY p_brand, ANY p_container): rowcount = 4500000.0, cumulative cost = \{4.045E8 rows, 1.5425E9 cpu, 6.0E8 io, 0.0 network, 0.0 memory}, id = 489468 06-03 Filter(condition=[AND(=($1, 'Brand#13'), =($2, 'JUMBO CAN'))]) : rowType = RecordType(ANY p_partkey, ANY p_brand, ANY p_container): rowcount = 4500000.0, cumulative cost = \{4.0E8 rows, 1.538E9 cpu, 6.0E8 io, 0.0 network, 0.0 memory}, id = 489467 06-04 Scan(table=[[dfs, tpchpar1000_micro, part]], groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///tpchParquet10/SF1000/part]], 
selectionRoot=maprfs:/tpchParquet10/SF1000/part, numFiles=1, numRowGroups=90, usedMetadataFile=false, columns=[`p_partkey`, `p_brand`, `p_container`]]]) : rowType = RecordType(ANY p_partkey, ANY p_brand, ANY p_container): rowcount = 2.0E8, cumulative cost = \{2.0E8 rows, 6.0E8 cpu, 6.0E8 io, 0.0 network, 0.0 memory}, id = 489466 01-06 Project(l_partkey=[$0], $f1=[divide(CastHigh(CASE(=($2, 0), null, $1)), $2)]) : rowType = RecordType(ANY l_partkey, ANY $f1): rowcount = 5.9999897089999996E7, cumulative cost = \{1.5059974169589998E10 rows, 2.3969958887455E11 cpu, 1.1999979418E10 io, 9.8303831392256E12 network, 1.1615980076624E11 memory}, id = 489484 01-08 HashAgg(group=[\{0}], agg#0=[$SUM0($1)], agg#1=[$SUM0($2)]) : rowType = RecordType(ANY l_partkey, ANY $f1, BIGINT $f2): rowcount = 5.9999897089999996E7, cumulative cost = \{1.4999974272499998E10 rows, 2.3939958938909998E11 cpu, 1.1999979418E10 io, 9.8303831392256E12 network, 1.1615980076624E11 memory}, id = 489483 01-10 Project(l_partkey=[$0], $f1=[$1], $f2=[$2]) : rowType = RecordType(ANY l_partkey, ANY $f1, BIGINT $f2): rowcount = 5.999989709E8, cumulative cost = \{1.4399975301599998E10 rows, 2.201996223203E11 cpu, 1.1999979418E10 io, 9.8303831392256E12 network, 1.0559981887840001E11 memory}, id = 489482 01-11 HashToRandomExchange(dist0=[[$0]]) : rowType = RecordType(ANY l_partkey, ANY $f1, BIGINT $f2, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 5.999989709E8, cumulative cost = \{1.3799976330699999E10 rows, 2.1839962540759998E11 cpu, 1.1999979418E10 io, 9.8303831392256E12 network, 1.0559981887840001E11 memory}, id = 489481 03-01 UnorderedMuxExchange : rowType = RecordType(ANY l_partkey, ANY $f1, BIGINT $f2, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 5.999989709E8, cumulative cost = \{1.31999773598E10 rows, 2.0879964187319998E11 cpu, 1.1999979418E10 io, 0.0 network, 1.0559981887840001E11 memory}, id = 489480 05-01 Project(l_partkey=[$0], $f1=[$1], $f2=[$2], E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($0, 1301011)]) : 
rowType = RecordType(ANY l_partkey, ANY $f1, BIGINT $f2, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 5.999989709E8, cumulative cost = \{1.25999783889E10 rows, 2.081996429023E11 cpu, 1.1999979418E10 io, 0.0 network, 1.0559981887840001E11 memory}, id = 489479 05-02 HashAgg(group=[\{0}], agg#0=[$SUM0($1)], agg#1=[COUNT($1)]) : rowType = RecordType(ANY l_partkey, ANY $f1, BIGINT $f2): rowcount = 5.999989709E8, cumulative cost = \{1.1999979418E10 rows, 2.03999650106E11 cpu, 1.1999979418E10 io, 0.0 network, 1.0559981887840001E11 memory}, id = 489478 05-03 Scan(table=[[dfs, tpchpar1000_micro, lineitem]], groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///tpchParquet10/SF1000/lineitem]], selectionRoot=maprfs:/tpchParquet10/SF1000/lineitem, numFiles=1, numRowGroups=3250, usedMetadataFile=false, columns=[`l_partkey`, `l_quantity`]]]) : rowType = RecordType(ANY l_partkey, ANY l_quantity): rowcount = 5.999989709E9, cumulative cost = \{5.999989709E9 rows, 1.1999979418E10 cpu, 1.1999979418E10 io, 0.0 network, 0.0 memory}, id = 489477
Here is the new plan:
00-00 Screen : rowType = RecordType(ANY avg_yearly): rowcount = 1.0, cumulative cost = \{2.589042618686726E10 rows, 3.1133328060351746E11 cpu, 3.0599948545E10 io, 3.4598519346958716E12 network, 1.0931196869447409E11 memory}, id = 62719 00-01 Project(avg_yearly=[/($0, 7.0)]) : rowType = RecordType(ANY avg_yearly): rowcount = 1.0, cumulative cost = \{2.589042618676726E10 rows, 3.113332806034175E11 cpu, 3.0599948545E10 io, 3.4598519346958716E12 network, 1.0931196869447409E11 memory}, id = 62718 00-02 StreamAgg(group=[{}], agg#0=[SUM($0)]) : rowType = RecordType(ANY $f0): rowcount = 1.0, cumulative cost = \{2.589042618576726E10 rows, 3.113332805994175E11 cpu, 3.0599948545E10 io, 3.4598519346958716E12 network, 1.0931196869447409E11 memory}, id = 62717 00-03 UnionExchange : rowType = RecordType(ANY $f0): rowcount = 1.0, cumulative cost = \{2.589042618476726E10 rows, 3.113332805874175E11 cpu, 3.0599948545E10 io, 3.4598519346958716E12 network, 1.0931196869447409E11 memory}, id = 62716 01-01 StreamAgg(group=[{}], agg#0=[SUM($0)]) : rowType = RecordType(ANY $f0): rowcount = 1.0, cumulative cost = \{2.589042618376726E10 rows, 3.113332805794175E11 cpu, 3.0599948545E10 io, 3.4598519305998716E12 network, 1.0931196869447409E11 memory}, id = 62715 01-02 Project(l_extendedprice=[$1]) : rowType = RecordType(ANY l_extendedprice): rowcount = 2928825.0930136647, cumulative cost = \{2.5887497358674248E10 rows, 3.1129813467830133E11 cpu, 3.0599948545E10 io, 3.4598519305998716E12 network, 1.0931196869447409E11 memory}, id = 62714 01-03 SelectionVectorRemover : rowType = RecordType(ANY l_quantity, ANY l_extendedprice, ANY p_partkey, ANY l_partkey, ANY $f1): rowcount = 2928825.0930136647, cumulative cost = \{2.5884568533581234E10 rows, 3.112952058532083E11 cpu, 3.0599948545E10 io, 3.4598519305998716E12 network, 1.0931196869447409E11 memory}, id = 62713 01-04 Filter(condition=[<($0, *(0.2, $4))]) : rowType = RecordType(ANY l_quantity, ANY l_extendedprice, ANY p_partkey, ANY l_partkey, ANY 
$f1): rowcount = 2928825.0930136647, cumulative cost = \{2.588163970848822E10 rows, 3.112922770281153E11 cpu, 3.0599948545E10 io, 3.4598519305998716E12 network, 1.0931196869447409E11 memory}, id = 62712 01-05 Project(l_quantity=[$2], l_extendedprice=[$3], p_partkey=[$4], l_partkey=[$0], $f1=[$1]) : rowType = RecordType(ANY l_quantity, ANY l_extendedprice, ANY p_partkey, ANY l_partkey, ANY $f1): rowcount = 5857650.186027329, cumulative cost = \{2.5875782058302193E10 rows, 3.1125713112699915E11 cpu, 3.0599948545E10 io, 3.4598519305998716E12 network, 1.0931196869447409E11 memory}, id = 62711 01-06 HashJoin(condition=[=($4, $0)], joinType=[inner], semi-join: =[false]) : rowType = RecordType(ANY l_partkey, ANY $f1, ANY l_quantity, ANY l_extendedprice, ANY p_partkey): rowcount = 5857650.186027329, cumulative cost = \{2.5869924408116165E10 rows, 3.1122784287606903E11 cpu, 3.0599948545E10 io, 3.4598519305998716E12 network, 1.0931196869447409E11 memory}, id = 62710 01-08 Project(l_partkey=[$0], $f1=[divide(CastHigh(CASE(=($2, 0), null, $1)), $2)]) : rowType = RecordType(ANY l_partkey, ANY $f1): rowcount = 2.04859953E8, cumulative cost = \{1.3229139136E10 rows, 2.17110687098E11 cpu, 1.1999979418E10 io, 3.356425469952E12 network, 1.0920535405120001E11 memory}, id = 62697 01-10 HashAgg(group=[\{0}], agg#0=[$SUM0($1)], agg#1=[$SUM0($2)]) : rowType = RecordType(ANY l_partkey, ANY $f1, BIGINT $f2): rowcount = 2.04859953E8, cumulative cost = \{1.3024279183E10 rows, 2.16086387333E11 cpu, 1.1999979418E10 io, 3.356425469952E12 network, 1.0920535405120001E11 memory}, id = 62696 01-11 Project(l_partkey=[$0], $f1=[$1], $f2=[$2]) : rowType = RecordType(ANY l_partkey, ANY $f1, BIGINT $f2): rowcount = 2.04859953E8, cumulative cost = \{1.281941923E10 rows, 2.09530868837E11 cpu, 1.1999979418E10 io, 3.356425469952E12 network, 1.0559981887840001E11 memory}, id = 62695 01-12 HashToRandomExchange(dist0=[[$0]]) : rowType = RecordType(ANY l_partkey, ANY $f1, BIGINT $f2, ANY 
E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 2.04859953E8, cumulative cost = \{1.2614559277E10 rows, 2.08916288978E11 cpu, 1.1999979418E10 io, 3.356425469952E12 network, 1.0559981887840001E11 memory}, id = 62694 02-01 UnorderedMuxExchange : rowType = RecordType(ANY l_partkey, ANY $f1, BIGINT $f2, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 2.04859953E8, cumulative cost = \{1.2409699324E10 rows, 2.0563852973E11 cpu, 1.1999979418E10 io, 0.0 network, 1.0559981887840001E11 memory}, id = 62693 04-01 Project(l_partkey=[$0], $f1=[$1], $f2=[$2], E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($0, 1301011)]) : rowType = RecordType(ANY l_partkey, ANY $f1, BIGINT $f2, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 2.04859953E8, cumulative cost = \{1.2204839371E10 rows, 2.05433669777E11 cpu, 1.1999979418E10 io, 0.0 network, 1.0559981887840001E11 memory}, id = 62692 04-02 HashAgg(group=[\{0}], agg#0=[$SUM0($1)], agg#1=[COUNT($1)]) : rowType = RecordType(ANY l_partkey, ANY $f1, BIGINT $f2): rowcount = 2.04859953E8, cumulative cost = \{1.1999979418E10 rows, 2.03999650106E11 cpu, 1.1999979418E10 io, 0.0 network, 1.0559981887840001E11 memory}, id = 62691 04-03 Scan(table=[[dfs, tpchpar1000_micro, lineitem]], groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///tpchParquet10/SF1000/lineitem]], selectionRoot=maprfs:/tpchParquet10/SF1000/lineitem, numFiles=1, numRowGroups=3250, usedMetadataFile=false, columns=[`l_partkey`, `l_quantity`]]]) : rowType = RecordType(ANY l_partkey, ANY l_quantity): rowcount = 5.999989709E9, cumulative cost = \{5.999989709E9 rows, 1.1999979418E10 cpu, 1.1999979418E10 io, 0.0 network, 0.0 memory}, id = 62690 01-07 Project(l_quantity=[$0], l_extendedprice=[$1], p_partkey=[$2]) : rowType = RecordType(ANY l_quantity, ANY l_extendedprice, ANY p_partkey): rowcount = 5857650.186027329, cumulative cost = \{1.2430067668930138E10 rows, 9.16119751405808E10 cpu, 1.8599969127E10 io, 1.0342646064787177E11 network, 3520000.0000000005 memory}, id = 62709 01-09 
HashToRandomExchange(dist0=[[$2]]) : rowType = RecordType(ANY l_quantity, ANY l_extendedprice, ANY p_partkey, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 5857650.186027329, cumulative cost = \{1.242421001874411E10 rows, 9.159440219002272E10 cpu, 1.8599969127E10 io, 1.0342646064787177E11 network, 3520000.0000000005 memory}, id = 62708 03-01 UnorderedMuxExchange : rowType = RecordType(ANY l_quantity, ANY l_extendedprice, ANY p_partkey, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 5857650.186027329, cumulative cost = \{1.2418352368558083E10 rows, 9.150067978704628E10 cpu, 1.8599969127E10 io, 7.45472E9 network, 3520000.0000000005 memory}, id = 62707 05-01 Project(l_quantity=[$0], l_extendedprice=[$1], p_partkey=[$2], E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($2, 1301011)]) : rowType = RecordType(ANY l_quantity, ANY l_extendedprice, ANY p_partkey, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 5857650.186027329, cumulative cost = \{1.2412494718372055E10 rows, 9.149482213686026E10 cpu, 1.8599969127E10 io, 7.45472E9 network, 3520000.0000000005 memory}, id = 62706 05-02 Project(l_quantity=[$1], l_extendedprice=[$2], p_partkey=[$3]) : rowType = RecordType(ANY l_quantity, ANY l_extendedprice, ANY p_partkey): rowcount = 5857650.186027329, cumulative cost = \{1.2406637068186028E10 rows, 9.145381858555807E10 cpu, 1.8599969127E10 io, 7.45472E9 network, 3520000.0000000005 memory}, id = 62705 05-03 HashJoin(condition=[=($0, $3)], joinType=[inner], semi-join: =[false]) : rowType = RecordType(ANY l_partkey, ANY l_quantity, ANY l_extendedprice, ANY p_partkey): rowcount = 5857650.186027329, cumulative cost = \{1.2400779418E10 rows, 9.1436245635E10 cpu, 1.8599969127E10 io, 7.45472E9 network, 3520000.0000000005 memory}, id = 62704 05-05 Scan(table=[[dfs, tpchpar1000_micro, lineitem]], groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///tpchParquet10/SF1000/lineitem]], selectionRoot=maprfs:/tpchParquet10/SF1000/lineitem, numFiles=1, numRowGroups=3250, usedMetadataFile=false, 
columns=[`l_partkey`, `l_quantity`, `l_extendedprice`]]]) : rowType = RecordType(ANY l_partkey, ANY l_quantity, ANY l_extendedprice): rowcount = 5.999989709E9, cumulative cost = \{5.999989709E9 rows, 1.7999969127E10 cpu, 1.7999969127E10 io, 0.0 network, 0.0 memory}, id = 62698 05-04 BroadcastExchange : rowType = RecordType(ANY p_partkey): rowcount = 200000.0, cumulative cost = \{4.006E8 rows, 1.4348E9 cpu, 6.0E8 io, 7.45472E9 network, 0.0 memory}, id = 62703 06-01 Project(p_partkey=[$0]) : rowType = RecordType(ANY p_partkey): rowcount = 200000.0, cumulative cost = \{4.004E8 rows, 1.4332E9 cpu, 6.0E8 io, 0.0 network, 0.0 memory}, id = 62702 06-02 SelectionVectorRemover : rowType = RecordType(ANY p_partkey, ANY p_brand, ANY p_container): rowcount = 200000.0, cumulative cost = \{4.002E8 rows, 1.433E9 cpu, 6.0E8 io, 0.0 network, 0.0 memory}, id = 62701 06-03 Filter(condition=[AND(=($1, 'Brand#13'), =($2, 'JUMBO CAN'))]) : rowType = RecordType(ANY p_partkey, ANY p_brand, ANY p_container): rowcount = 200000.0, cumulative cost = \{4.0E8 rows, 1.4328E9 cpu, 6.0E8 io, 0.0 network, 0.0 memory}, id = 62700 06-04 Scan(table=[[dfs, tpchpar1000_micro, part]], groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///tpchParquet10/SF1000/part]], selectionRoot=maprfs:/tpchParquet10/SF1000/part, numFiles=1, numRowGroups=90, usedMetadataFile=false, columns=[`p_partkey`, `p_brand`, `p_container`]]]) : rowType = RecordType(ANY p_partkey, ANY p_brand, ANY p_container): rowcount = 2.0E8, cumulative cost = \{2.0E8 rows, 6.0E8 cpu, 6.0E8 io, 0.0 network, 0.0 memory}, id = 62699
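For anyone comparing the two plans, the row count estimates are easier to read side by side once they are pulled out of the plan text. The following is a minimal Python sketch, not part of Drill: the helper names `row_estimates` and `filter_selectivity` are hypothetical, the regular expression assumes the plan layout pasted above, and the two excerpts are abridged from the 06-03 and 06-04 operators of each plan (only the operator id, name, rowcount and id fields are kept).

```python
import re

# Matches "MM-NN Operator" and, lazily, that operator's "rowcount = <number>".
# Assumption: the plan text follows the layout pasted in this issue.
OPERATOR_RE = re.compile(
    r"(\d{2}-\d{2})\s+([A-Za-z]+)\b.*?rowcount = ([0-9.]+(?:E[0-9]+)?)",
    re.DOTALL,
)


def row_estimates(plan_text):
    """Map operator id (e.g. '06-03') to (operator name, estimated row count)."""
    return {
        op_id: (name, float(rows))
        for op_id, name, rows in OPERATOR_RE.findall(plan_text)
    }


def filter_selectivity(plan_text, filter_id, scan_id):
    """Estimated fraction of scanned rows that the planner expects to pass the filter."""
    ops = row_estimates(plan_text)
    return ops[filter_id][1] / ops[scan_id][1]


# Abridged excerpts of the filter on `part` and the scan below it.
OLD_PART = (
    "06-03 Filter(condition=[AND(=($1, 'Brand#13'), =($2, 'JUMBO CAN'))]) : "
    "rowcount = 4500000.0, id = 489467 "
    "06-04 Scan(table=[[dfs, tpchpar1000_micro, part]]) : "
    "rowcount = 2.0E8, id = 489466"
)
NEW_PART = (
    "06-03 Filter(condition=[AND(=($1, 'Brand#13'), =($2, 'JUMBO CAN'))]) : "
    "rowcount = 200000.0, id = 62700 "
    "06-04 Scan(table=[[dfs, tpchpar1000_micro, part]]) : "
    "rowcount = 2.0E8, id = 62699"
)

old_sel = filter_selectivity(OLD_PART, "06-03", "06-04")
new_sel = filter_selectivity(NEW_PART, "06-03", "06-04")
print("original plan filter selectivity:", old_sel)
print("new plan filter selectivity:", new_sel)
```

With these excerpts, the estimate for the filter on part drops from 4500000 rows to 200000 rows out of 2.0E8 scanned. The same helper can be pointed at the full plan text to compare the HashJoin estimates.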
I have attached two profiles. Profile /2384d66b-1b93-6fe1-8abe-34cc74994138 is from commit id 4627973bde9847a4eb2672c44941136c167326a1, the commit immediately before the Statistics commit. It does not contain the Statistics code and serves as the baseline. Profile 23650ee5-6721-8a8f-7dd3-f5dd09a3a7b0 is from commit id 212e5c0d9656cd572426aa514bf37e0bd002bdd6, which contains the Statistics code and the fix for DRILL-7109.
Attachments
Issue Links
- Dependent
- DRILL-7227 TPCDS queries 47, 57, 59 fail to run with Statistics enabled at sf100
- Resolved