Details
-
Bug
-
Status: Resolved
-
Blocker
-
Resolution: Fixed
-
1.16.0
Description
Here is TPCH 04 with sf 1000:
select o.o_orderpriority, count(*) as order_count from orders o where o.o_orderdate >= date '1996-10-01' and o.o_orderdate < date '1996-10-01' + interval '3' month and exists ( select * from lineitem l where l.l_orderkey = o.o_orderkey and l.l_commitdate < l.l_receiptdate ) group by o.o_orderpriority order by o.o_orderpriority;
TPCH query 4 takes 30% longer. The plan is the same. But the Hash Agg operator in the new plan is taking longer. One possible reason is that the Hash Agg operator in the new plan is not using as many buckets as the old plan did. The memory usage of the Hash Agg operator in the new plan is using less memory compared to the old plan.
Here is the old plan:
00-00 Screen : rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 3750000.0, cumulative cost = {1.9163601940441746E10 rows, 9.07316867594483E10 cpu, 2.2499969127E10 io, 3.59423968386048E12 network, 2.2631985057468002E10 memory}, id = 5645 00-01 Project(o_orderpriority=[$0], order_count=[$1]) : rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 3750000.0, cumulative cost = {1.9163226940441746E10 rows, 9.07313117594483E10 cpu, 2.2499969127E10 io, 3.59423968386048E12 network, 2.2631985057468002E10 memory}, id = 5644 00-02 SingleMergeExchange(sort0=[0]) : rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 3750000.0, cumulative cost = {1.9159476940441746E10 rows, 9.07238117594483E10 cpu, 2.2499969127E10 io, 3.59423968386048E12 network, 2.2631985057468002E10 memory}, id = 5643 01-01 OrderedMuxExchange(sort0=[0]) : rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 3750000.0, cumulative cost = {1.9155726940441746E10 rows, 9.0643982838025E10 cpu, 2.2499969127E10 io, 3.56351968386048E12 network, 2.2631985057468002E10 memory}, id = 5642 02-01 SelectionVectorRemover : rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 3750000.0, cumulative cost = {1.9151976940441746E10 rows, 9.0640232838025E10 cpu, 2.2499969127E10 io, 3.56351968386048E12 network, 2.2631985057468002E10 memory}, id = 5641 02-02 Sort(sort0=[$0], dir0=[ASC]) : rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 3750000.0, cumulative cost = {1.9148226940441746E10 rows, 9.0636482838025E10 cpu, 2.2499969127E10 io, 3.56351968386048E12 network, 2.2631985057468002E10 memory}, id = 5640 02-03 HashAgg(group=[{0}], order_count=[$SUM0($1)]) : rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 3750000.0, cumulative cost = {1.9144476940441746E10 rows, 9.030890595055101E10 cpu, 2.2499969127E10 io, 3.56351968386048E12 network, 2.2571985057468002E10 memory}, id = 5639 02-04 HashToRandomExchange(dist0=[[$0]]) : rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 3.75E7, cumulative cost = {1.9106976940441746E10 rows, 8.955890595055101E10 cpu, 2.2499969127E10 io, 3.56351968386048E12 network, 2.1911985057468002E10 memory}, id = 5638 03-01 HashAgg(group=[{0}], order_count=[COUNT()]) : rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 3.75E7, cumulative cost = {1.9069476940441746E10 rows, 8.895890595055101E10 cpu, 2.2499969127E10 io, 3.25631968386048E12 network, 2.1911985057468002E10 memory}, id = 5637 03-02 Project(o_orderpriority=[$1]) : rowType = RecordType(ANY o_orderpriority): rowcount = 3.75E8, cumulative cost = {1.8694476940441746E10 rows, 8.145890595055101E10 cpu, 2.2499969127E10 io, 3.25631968386048E12 network, 1.5311985057468002E10 memory}, id = 5636 03-03 Project(o_orderkey=[$1], o_orderpriority=[$2], l_orderkey=[$0]) : rowType = RecordType(ANY o_orderkey, ANY o_orderpriority, ANY l_orderkey): rowcount = 3.75E8, cumulative cost = {1.8319476940441746E10 rows, 8.108390595055101E10 cpu, 2.2499969127E10 io, 3.25631968386048E12 network, 1.5311985057468002E10 memory}, id = 5635 03-04 HashJoin(condition=[=($1, $0)], joinType=[inner], semi-join: =[false]) : rowType = RecordType(ANY l_orderkey, ANY o_orderkey, ANY o_orderpriority): rowcount = 3.75E8, cumulative cost = {1.7944476940441746E10 rows, 7.995890595055101E10 cpu, 2.2499969127E10 io, 3.25631968386048E12 network, 1.5311985057468002E10 memory}, id = 5634 03-05 HashToRandomExchange(dist0=[[$0]]) : rowType = RecordType(ANY o_orderkey, ANY o_orderpriority): rowcount = 3.75E8, cumulative cost = {4.125E9 rows, 2.2125E10 cpu, 4.5E9 io, 3.072E12 network, 0.0 memory}, id = 5633 05-01 Project(o_orderkey=[$1], o_orderpriority=[$2]) : rowType = RecordType(ANY o_orderkey, ANY o_orderpriority): rowcount = 3.75E8, cumulative cost = {3.75E9 rows, 1.6125E10 cpu, 4.5E9 io, 0.0 network, 0.0 memory}, id = 5632 05-02 SelectionVectorRemover : rowType = RecordType(ANY o_orderdate, ANY o_orderkey, ANY o_orderpriority): rowcount = 3.75E8, cumulative cost = {3.375E9 rows, 1.5375E10 cpu, 4.5E9 io, 0.0 network, 0.0 memory}, id = 5631 05-03 Filter(condition=[AND(>=($0, 1996-10-01), <($0, 1997-01-01 00:00:00))]) : rowType = RecordType(ANY o_orderdate, ANY o_orderkey, ANY o_orderpriority): rowcount = 3.75E8, cumulative cost = {3.0E9 rows, 1.5E10 cpu, 4.5E9 io, 0.0 network, 0.0 memory}, id = 5630 05-04 Scan(table=[[dfs, tpchpar1000_micro, orders]], groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///tpchParquet10/SF1000/orders]], selectionRoot=maprfs:/tpchParquet10/SF1000/orders, numFiles=1, numRowGroups=750, usedMetadataFile=false, columns=[`o_orderdate`, `o_orderkey`, `o_orderpriority`]]]) : rowType = RecordType(ANY o_orderdate, ANY o_orderkey, ANY o_orderpriority): rowcount = 1.5E9, cumulative cost = {1.5E9 rows, 4.5E9 cpu, 4.5E9 io, 0.0 network, 0.0 memory}, id = 5629 03-06 HashAgg(group=[{0}]) : rowType = RecordType(ANY l_orderkey): rowcount = 4499992.28175, cumulative cost = {1.3439976948159996E10 rows, 5.477990604317001E10 cpu, 1.7999969127E10 io, 1.8431968386048E11 network, 8.711985057468E9 memory}, id = 5628 03-07 HashToRandomExchange(dist0=[[$0]]) : rowType = RecordType(ANY l_orderkey): rowcount = 4.49999228175E7, cumulative cost = {1.3394977025342497E10 rows, 5.441990666063001E10 cpu, 1.7999969127E10 io, 1.8431968386048E11 network, 7.919986415880001E9 memory}, id = 5627 04-01 HashAgg(group=[{0}]) : rowType = RecordType(ANY l_orderkey): rowcount = 4.49999228175E7, cumulative cost = {1.3349977102524998E10 rows, 5.369990789555001E10 cpu, 1.7999969127E10 io, 0.0 network, 7.919986415880001E9 memory}, id = 5626 04-02 Project(l_orderkey=[$0], i=[true]) : rowType = RecordType(ANY l_orderkey, BOOLEAN i): rowcount = 4.49999228175E8, cumulative cost = {1.2899977874349998E10 rows, 5.009991407015001E10 cpu, 1.7999969127E10 io, 0.0 network, 0.0 memory}, id = 5625 04-03 SelectionVectorRemover : rowType = RecordType(ANY l_orderkey, ANY l_commitdate, ANY l_receiptdate): rowcount = 4.49999228175E8, cumulative cost = {1.2449978646175E10 rows, 4.784991792927501E10 cpu, 1.7999969127E10 io, 0.0 network, 0.0 memory}, id = 5624 04-04 Filter(condition=[AND(=($0, $0), <($1, $2))]) : rowType = RecordType(ANY l_orderkey, ANY l_commitdate, ANY l_receiptdate): rowcount = 4.49999228175E8, cumulative cost = {1.1999979418E10 rows, 4.7399918701100006E10 cpu, 1.7999969127E10 io, 0.0 network, 0.0 memory}, id = 5623 04-05 Scan(table=[[dfs, tpchpar1000_micro, lineitem]], groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///tpchParquet10/SF1000/lineitem]], selectionRoot=maprfs:/tpchParquet10/SF1000/lineitem, numFiles=1, numRowGroups=3250, usedMetadataFile=false, columns=[`l_orderkey`, `l_commitdate`, `l_receiptdate`]]]) : rowType = RecordType(ANY l_orderkey, ANY l_commitdate, ANY l_receiptdate): rowcount = 5.999989709E9, cumulative cost = {5.999989709E9 rows, 1.7999969127E10 cpu, 1.7999969127E10 io, 0.0 network, 0.0 memory}, id = 5622
Here is the new plan:
00-00 Screen : rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 3750000.0, cumulative cost = {1.9163601940441746E10 rows, 9.07316867594483E10 cpu, 2.2499969127E10 io, 3.59423968386048E12 network, 2.2631985057468002E10 memory}, id = 11739 00-01 Project(o_orderpriority=[$0], order_count=[$1]) : rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 3750000.0, cumulative cost = {1.9163226940441746E10 rows, 9.07313117594483E10 cpu, 2.2499969127E10 io, 3.59423968386048E12 network, 2.2631985057468002E10 memory}, id = 11738 00-02 SingleMergeExchange(sort0=[0]) : rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 3750000.0, cumulative cost = {1.9159476940441746E10 rows, 9.07238117594483E10 cpu, 2.2499969127E10 io, 3.59423968386048E12 network, 2.2631985057468002E10 memory}, id = 11737 01-01 OrderedMuxExchange(sort0=[0]) : rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 3750000.0, cumulative cost = {1.9155726940441746E10 rows, 9.0643982838025E10 cpu, 2.2499969127E10 io, 3.56351968386048E12 network, 2.2631985057468002E10 memory}, id = 11736 02-01 SelectionVectorRemover : rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 3750000.0, cumulative cost = {1.9151976940441746E10 rows, 9.0640232838025E10 cpu, 2.2499969127E10 io, 3.56351968386048E12 network, 2.2631985057468002E10 memory}, id = 11735 02-02 Sort(sort0=[$0], dir0=[ASC]) : rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 3750000.0, cumulative cost = {1.9148226940441746E10 rows, 9.0636482838025E10 cpu, 2.2499969127E10 io, 3.56351968386048E12 network, 2.2631985057468002E10 memory}, id = 11734 02-03 HashAgg(group=[{0}], order_count=[$SUM0($1)]) : rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 3750000.0, cumulative cost = {1.9144476940441746E10 rows, 9.030890595055101E10 cpu, 2.2499969127E10 io, 3.56351968386048E12 network, 2.2571985057468002E10 memory}, id = 11733 02-04 HashToRandomExchange(dist0=[[$0]]) : rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 3.75E7, cumulative cost = {1.9106976940441746E10 rows, 8.955890595055101E10 cpu, 2.2499969127E10 io, 3.56351968386048E12 network, 2.1911985057468002E10 memory}, id = 11732 03-01 HashAgg(group=[{0}], order_count=[COUNT()]) : rowType = RecordType(ANY o_orderpriority, BIGINT order_count): rowcount = 3.75E7, cumulative cost = {1.9069476940441746E10 rows, 8.895890595055101E10 cpu, 2.2499969127E10 io, 3.25631968386048E12 network, 2.1911985057468002E10 memory}, id = 11731 03-02 Project(o_orderpriority=[$1]) : rowType = RecordType(ANY o_orderpriority): rowcount = 3.75E8, cumulative cost = {1.8694476940441746E10 rows, 8.145890595055101E10 cpu, 2.2499969127E10 io, 3.25631968386048E12 network, 1.5311985057468002E10 memory}, id = 11730 03-03 Project(o_orderkey=[$1], o_orderpriority=[$2], l_orderkey=[$0]) : rowType = RecordType(ANY o_orderkey, ANY o_orderpriority, ANY l_orderkey): rowcount = 3.75E8, cumulative cost = {1.8319476940441746E10 rows, 8.108390595055101E10 cpu, 2.2499969127E10 io, 3.25631968386048E12 network, 1.5311985057468002E10 memory}, id = 11729 03-04 HashJoin(condition=[=($1, $0)], joinType=[inner], semi-join: =[false]) : rowType = RecordType(ANY l_orderkey, ANY o_orderkey, ANY o_orderpriority): rowcount = 3.75E8, cumulative cost = {1.7944476940441746E10 rows, 7.995890595055101E10 cpu, 2.2499969127E10 io, 3.25631968386048E12 network, 1.5311985057468002E10 memory}, id = 11728 03-05 HashToRandomExchange(dist0=[[$0]]) : rowType = RecordType(ANY o_orderkey, ANY o_orderpriority): rowcount = 3.75E8, cumulative cost = {4.125E9 rows, 2.2125E10 cpu, 4.5E9 io, 3.072E12 network, 0.0 memory}, id = 11727 05-01 Project(o_orderkey=[$1], o_orderpriority=[$2]) : rowType = RecordType(ANY o_orderkey, ANY o_orderpriority): rowcount = 3.75E8, cumulative cost = {3.75E9 rows, 1.6125E10 cpu, 4.5E9 io, 0.0 network, 0.0 memory}, id = 11726 05-02 SelectionVectorRemover : rowType = RecordType(ANY o_orderdate, ANY o_orderkey, ANY o_orderpriority): rowcount = 3.75E8, cumulative cost = {3.375E9 rows, 1.5375E10 cpu, 4.5E9 io, 0.0 network, 0.0 memory}, id = 11725 05-03 Filter(condition=[AND(>=($0, 1996-10-01), <($0, 1997-01-01 00:00:00))]) : rowType = RecordType(ANY o_orderdate, ANY o_orderkey, ANY o_orderpriority): rowcount = 3.75E8, cumulative cost = {3.0E9 rows, 1.5E10 cpu, 4.5E9 io, 0.0 network, 0.0 memory}, id = 11724 05-04 Scan(table=[[dfs, tpchpar1000_micro, orders]], groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///tpchParquet10/SF1000/orders]], selectionRoot=maprfs:/tpchParquet10/SF1000/orders, numFiles=1, numRowGroups=750, usedMetadataFile=false, columns=[`o_orderdate`, `o_orderkey`, `o_orderpriority`]]]) : rowType = RecordType(ANY o_orderdate, ANY o_orderkey, ANY o_orderpriority): rowcount = 1.5E9, cumulative cost = {1.5E9 rows, 4.5E9 cpu, 4.5E9 io, 0.0 network, 0.0 memory}, id = 11723 03-06 HashAgg(group=[{0}]) : rowType = RecordType(ANY l_orderkey): rowcount = 4499992.28175, cumulative cost = {1.3439976948159996E10 rows, 5.477990604317001E10 cpu, 1.7999969127E10 io, 1.8431968386048E11 network, 8.711985057468E9 memory}, id = 11722 03-07 HashToRandomExchange(dist0=[[$0]]) : rowType = RecordType(ANY l_orderkey): rowcount = 4.49999228175E7, cumulative cost = {1.3394977025342497E10 rows, 5.441990666063001E10 cpu, 1.7999969127E10 io, 1.8431968386048E11 network, 7.919986415880001E9 memory}, id = 11721 04-01 HashAgg(group=[{0}]) : rowType = RecordType(ANY l_orderkey): rowcount = 4.49999228175E7, cumulative cost = {1.3349977102524998E10 rows, 5.369990789555001E10 cpu, 1.7999969127E10 io, 0.0 network, 7.919986415880001E9 memory}, id = 11720 04-02 Project(l_orderkey=[$0], i=[true]) : rowType = RecordType(ANY l_orderkey, BOOLEAN i): rowcount = 4.49999228175E8, cumulative cost = {1.2899977874349998E10 rows, 5.009991407015001E10 cpu, 1.7999969127E10 io, 0.0 network, 0.0 memory}, id = 11719 04-03 SelectionVectorRemover : rowType = RecordType(ANY l_orderkey, ANY l_commitdate, ANY l_receiptdate): rowcount = 4.49999228175E8, cumulative cost = {1.2449978646175E10 rows, 4.784991792927501E10 cpu, 1.7999969127E10 io, 0.0 network, 0.0 memory}, id = 11718 04-04 Filter(condition=[AND(=($0, $0), <($1, $2))]) : rowType = RecordType(ANY l_orderkey, ANY l_commitdate, ANY l_receiptdate): rowcount = 4.49999228175E8, cumulative cost = {1.1999979418E10 rows, 4.7399918701100006E10 cpu, 1.7999969127E10 io, 0.0 network, 0.0 memory}, id = 11717 04-05 Scan(table=[[dfs, tpchpar1000_micro, lineitem]], groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///tpchParquet10/SF1000/lineitem]], selectionRoot=maprfs:/tpchParquet10/SF1000/lineitem, numFiles=1, numRowGroups=3250, usedMetadataFile=false, columns=[`l_orderkey`, `l_commitdate`, `l_receiptdate`]]]) : rowType = RecordType(ANY l_orderkey, ANY l_commitdate, ANY l_receiptdate): rowcount = 5.999989709E9, cumulative cost = {5.999989709E9 rows, 1.7999969127E10 cpu, 1.7999969127E10 io, 0.0 network, 0.0 memory}, id = 11716
I have attached two profiles. 235a471b-aa97-bfb5-207d-3f25b4b5fbbb is from commit id 4627973bde9847a4eb2672c44941136c167326a1. This does not have Statistics code and serves as the baseline. It is the commit prior to the Statistics commit. 235a3ed4-e3d1-f3b7-39c5-fc947f56b6d5 is from commit id 5c436dbb028b813e80b7b8fcf045af31f0bcf68b. This has the Statistics code with the fix for disabled Statistics.
I also pulled the logs from the foreman to show the memory limit for Hash Agg. With the baseline, HashAgg has a limit of 10GB:
2019-04-04 00:57:27,446 [235a471b-aa97-bfb5-207d-3f25b4b5fbbb:frag:4:142] TRACE o.a.d.e.p.i.aggregate.HashAggregator - 1st phase. Estimated internal row width: 8 Values row width: 0 batch size: 720896 memory limit: 10000000000 max column width: 8
With statistics disabled, HashAgg has a limit of 64MB:
2019-04-04 01:32:48,132 [235a3ed4-e3d1-f3b7-39c5-fc947f56b6d5:frag:4:182] TRACE o.a.d.e.p.i.aggregate.HashAggregator - 1st phase. Estimated internal row width: 8 Values row width: 0 batch size: 720896 memory limit: 65075262 max column width: 8