Details
- Improvement
- Status: Resolved
- Major
- Resolution: Fixed
- 4.0.0
Description
The `allAttributes` method in `QueryPlan` unions the output of all of a node's children. This is fine in an optimized plan, but in an analyzed plan that has not yet been optimized, these attributes add up multiplicatively with the size of the plan. Because `output` is usually defined as a `def`, each node's `allAttributes` also ends up holding its own distinct copy of each attribute, which can put significant memory pressure on the driver (especially under concurrency and with wide tables).
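The copying effect can be illustrated with a small self-contained sketch. The classes below (`Attr`, `Plan`, `Relation`, `Alias`, `Filter`) are hypothetical toy stand-ins, not Spark's Catalyst classes; the sketch only assumes that an alias-like operator defines `output` as a `def` that allocates new attribute objects on every call, and that each node caches the union of its children's output.

```scala
import java.util.{Collections, IdentityHashMap}

// Toy stand-ins for Catalyst's attribute and plan classes (hypothetical, not Spark's API).
final class Attr(val name: String, val qualifier: Seq[String]) {
  def withQualifier(q: Seq[String]): Attr = new Attr(name, q)
}

sealed trait Plan {
  def children: Seq[Plan]
  def output: Seq[Attr]
  // Cached union of the children's output, mirroring the behaviour described above.
  lazy val allAttributes: Seq[Attr] = children.flatMap(_.output)
}

final class Relation(cols: Seq[Attr]) extends Plan {
  def children: Seq[Plan] = Nil
  def output: Seq[Attr] = cols
}

final class Alias(alias: String, child: Plan) extends Plan {
  def children: Seq[Plan] = Seq(child)
  // A `def` that re-qualifies the child's attributes: every call allocates new objects.
  def output: Seq[Attr] = child.output.map(_.withQualifier(Seq(alias)))
}

final class Filter(child: Plan) extends Plan {
  def children: Seq[Plan] = Seq(child)
  def output: Seq[Attr] = child.output
}

object AllAttributesCopies {
  val relation = new Relation((1 to 3).map(i => new Attr(s"c$i", Nil)))
  val aliased = new Alias("t", relation)
  val inner = new Filter(aliased)
  val outer = new Filter(inner)

  // Counts attribute objects by identity across every node's cached allAttributes.
  def retainedObjects(nodes: Seq[Plan]): Int = {
    val seen = Collections.newSetFromMap(new IdentityHashMap[Attr, java.lang.Boolean]())
    nodes.foreach(_.allAttributes.foreach(a => seen.add(a)))
    seen.size
  }

  def main(args: Array[String]): Unit =
    println(retainedObjects(Seq(aliased, inner, outer)))
}
```

In this toy plan there are only three logical columns, yet the three non-leaf nodes together retain nine distinct attribute objects, because each cached `allAttributes` triggers a fresh call to the alias's `output`.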
Here is a simple example with TPC-DS Q42. SQL:
select dt.d_year, item.i_category_id, item.i_category, sum(ss_ext_sales_price)
from date_dim dt, store_sales, item
where dt.d_date_sk = store_sales.ss_sold_date_sk
  and store_sales.ss_item_sk = item.i_item_sk
  and item.i_manager_id = 1
  and dt.d_moy = 11
  and dt.d_year = 2000
group by dt.d_year, item.i_category_id, item.i_category
order by sum(ss_ext_sales_price) desc, dt.d_year, item.i_category_id, item.i_category
limit 100
If we print out the size of each operator’s output and the size of its `allAttributes`:
GlobalLimit: allAttrs: 4, output: 4
LocalLimit: allAttrs: 4, output: 4
Sort: allAttrs: 4, output: 4
Aggregate: allAttrs: 73, output: 4
Filter: allAttrs: 73, output: 73
Join: allAttrs: 73, output: 73
Join: allAttrs: 51, output: 51
SubqueryAlias: allAttrs: 28, output: 28
SubqueryAlias: allAttrs: 28, output: 28
LogicalRelation: allAttrs: 0, output: 28
SubqueryAlias: allAttrs: 23, output: 23
LogicalRelation: allAttrs: 0, output: 23
SubqueryAlias: allAttrs: 22, output: 22
LogicalRelation: allAttrs: 0, output: 22
Note how the top-level join, the filter, and the aggregate each carry 73 attributes in `allAttributes`, which is the sum of the widths of the three relations (28 + 23 + 22). For queries over wider schemas, the problem is much worse. Optimized plans after column pruning look far better:
Aggregate: allAttrs: 0, output: 1
Project: allAttrs: 2, output: 0
Join: allAttrs: 2, output: 2
Project: allAttrs: 3, output: 1
Join: allAttrs: 3, output: 3
Project: allAttrs: 4, output: 2
Join: allAttrs: 4, output: 4
Project: allAttrs: 5, output: 3
Join: allAttrs: 5, output: 5
Project: allAttrs: 23, output: 4
Filter: allAttrs: 23, output: 23
LogicalRelation: allAttrs: 0, output: 23
Project: allAttrs: 29, output: 1
Filter: allAttrs: 29, output: 29
LogicalRelation: allAttrs: 0, output: 29
Project: allAttrs: 13, output: 1
Filter: allAttrs: 13, output: 13
LogicalRelation: allAttrs: 0, output: 13
Project: allAttrs: 22, output: 1
Filter: allAttrs: 22, output: 22
LogicalRelation: allAttrs: 0, output: 22
Project: allAttrs: 18, output: 1
Filter: allAttrs: 18, output: 18
LogicalRelation: allAttrs: 0, output: 18
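The analyzed-plan numbers for Q42 can be reproduced with a width-only toy model, sketched below. `Node` and the helpers in `Q42Widths` are hypothetical and track only attribute counts; the tree shape and the mapping of the widths 28, 23 and 22 to `date_dim`, `store_sales` and `item` are assumptions inferred from the listing and the order of the FROM clause, not taken from Spark's output.

```scala
// Toy plan node that tracks only widths (attribute counts); not a Spark class.
final case class Node(name: String, outputWidth: Int, children: Seq[Node] = Nil) {
  // Size of the union of the children's outputs.
  def allAttrsWidth: Int = children.map(_.outputWidth).sum

  // One line per operator, indented by depth in the (assumed) tree.
  def lines(depth: Int = 0): Seq[String] =
    (("  " * depth) + s"$name: allAttrs: $allAttrsWidth, output: $outputWidth") +:
      children.flatMap(_.lines(depth + 1))

  // Sum of allAttrs over this node and every descendant.
  def totalRetained: Int = allAttrsWidth + children.map(_.totalRetained).sum
}

object Q42Widths {
  def relation(width: Int): Node = Node("LogicalRelation", width)

  // Operators whose output is as wide as their child's output.
  def passThrough(name: String, child: Node): Node =
    Node(name, child.outputWidth, Seq(child))

  def join(left: Node, right: Node): Node =
    Node("Join", left.outputWidth + right.outputWidth, Seq(left, right))

  // Widths read off the analyzed-plan listing; table names are an assumption.
  val dateDim = passThrough("SubqueryAlias", passThrough("SubqueryAlias", relation(28)))
  val storeSales = passThrough("SubqueryAlias", relation(23))
  val item = passThrough("SubqueryAlias", relation(22))

  val joined = join(join(dateDim, storeSales), item)
  val aggregate = Node("Aggregate", 4, Seq(passThrough("Filter", joined)))
  val plan =
    passThrough("GlobalLimit", passThrough("LocalLimit", passThrough("Sort", aggregate)))

  def main(args: Array[String]): Unit = {
    plan.lines().foreach(println)
    println(s"total retained: ${plan.totalRetained}")
  }
}
```

Under these assumptions the model yields the same per-operator numbers as the analyzed plan, and summing `allAttrs` over all 14 operators gives 383 retained attributes for base relations that have only 73 columns between them.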