explain SELECT state, cnt,dummy_agg FROM ( SELECT a.ca_state state, count(*) cnt, min (ss_coupon_amt) dummy_agg FROM customer_address a, customer c, store_sales s, date_dim d, item i WHERE a.ca_address_sk = c.c_current_addr_sk AND c.c_customer_sk = s.ss_customer_sk AND s.ss_sold_date_sk = d_date_sk AND s.ss_item_sk = i.i_item_sk AND d.d_month_seq = (SELECT distinct (d_month_seq) FROM date_dim WHERE d_year = 1998 AND d_moy = 1) AND i.i_current_price > 1.2 * (SELECT avg(j.i_current_price) FROM item j WHERE j.i_category = i.i_category) GROUP BY a.ca_state ) x WHERE cnt >= 10 ORDER BY cnt LIMIT 100 INFO : Starting task [Stage-3:EXPLAIN] in serial mode INFO : Completed executing command(queryId=hive_20230131061201_798c75e0-ac60-4dbc-b58c-e4a9b90e4ed1); Time taken: 0.066 seconds INFO : OK Explain STAGE DEPENDENCIES: Stage-1 is a root stage Stage-0 depends on stages: Stage-1 STAGE PLANS: Stage: Stage-1 Tez DagId: hive_20230131061201_798c75e0-ac60-4dbc-b58c-e4a9b90e4ed1:46 Edges: Map 1 <- Reducer 4 (BROADCAST_EDGE) Map 6 <- Reducer 5 (BROADCAST_EDGE) Map 7 <- Map 1 (BROADCAST_EDGE), Map 13 (BROADCAST_EDGE), Map 6 (BROADCAST_EDGE), Reducer 12 (BROADCAST_EDGE) Reducer 12 <- Map 10 (BROADCAST_EDGE), Map 11 (SIMPLE_EDGE) Reducer 3 <- Map 2 (SIMPLE_EDGE) Reducer 4 <- Reducer 3 (CUSTOM_SIMPLE_EDGE) Reducer 5 <- Map 2 (SIMPLE_EDGE) Reducer 8 <- Map 7 (SIMPLE_EDGE) Reducer 9 <- Reducer 8 (SIMPLE_EDGE) DagName: hive_20230131061201_798c75e0-ac60-4dbc-b58c-e4a9b90e4ed1:46 Vertices: Map 1 Map Operator Tree: TableScan alias: a filterExpr: ca_address_sk is not null (type: boolean) Statistics: Num rows: 6000000 Data size: 1071600188 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: ca_address_sk is not null (type: boolean) Statistics: Num rows: 5700000 Data size: 1018020178 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: ca_address_sk (type: int), ca_state (type: string) outputColumnNames: _col0, _col1 Statistics: Num rows: 5700000 Data size: 1018020178 Basic stats: COMPLETE Column stats: NONE Map Join Operator condition map: Inner Join 0 to 1 keys: 0 1 outputColumnNames: _col0, _col1 input vertices: 1 Reducer 4 Statistics: Num rows: 5700000 Data size: 1137720178 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: _col0 (type: int) null sort order: a sort order: + Map-reduce partition columns: _col0 (type: int) Statistics: Num rows: 5700000 Data size: 1137720178 Basic stats: COMPLETE Column stats: NONE value expressions: _col1 (type: string) Execution mode: vectorized, llap LLAP IO: all inputs (cache only) Map 10 Map Operator Tree: TableScan alias: i filterExpr: (i_current_price is not null and i_item_sk is not null and i_category is not null) (type: boolean) Statistics: Num rows: 300000 Data size: 85500300 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: (i_current_price is not null and i_item_sk is not null and i_category is not null) (type: boolean) Statistics: Num rows: 255000 Data size: 72675255 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: i_item_sk (type: int), i_current_price (type: decimal(7,2)), i_category (type: string) outputColumnNames: _col0, _col1, _col2 Statistics: Num rows: 255000 Data size: 72675255 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: _col2 (type: string) null sort order: a sort order: + Map-reduce partition columns: _col2 (type: string) Statistics: Num rows: 255000 Data size: 72675255 Basic stats: COMPLETE Column stats: NONE value expressions: _col0 (type: int), _col1 (type: decimal(7,2)) Execution mode: vectorized, llap LLAP IO: all inputs (cache only) Map 11 Map Operator Tree: TableScan alias: j filterExpr: i_category is not null (type: boolean) Statistics: Num rows: 300000 Data size: 84360296 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: i_category is not null (type: boolean) Statistics: Num rows: 285000 Data size: 80142281 Basic stats: COMPLETE Column stats: NONE Group By Operator aggregations: sum(i_current_price), count(i_current_price) keys: i_category (type: string) minReductionHashAggr: 0.99 mode: hash outputColumnNames: _col0, _col1, _col2 Statistics: Num rows: 285000 Data size: 80142281 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: _col0 (type: string) null sort order: a sort order: + Map-reduce partition columns: _col0 (type: string) Statistics: Num rows: 285000 Data size: 80142281 Basic stats: COMPLETE Column stats: NONE value expressions: _col1 (type: decimal(17,2)), _col2 (type: bigint) Execution mode: vectorized, llap LLAP IO: all inputs (cache only) Map 13 Map Operator Tree: TableScan alias: c filterExpr: (c_customer_sk is not null and c_current_addr_sk is not null) (type: boolean) Statistics: Num rows: 12000000 Data size: 91200008 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: (c_customer_sk is not null and c_current_addr_sk is not null) (type: boolean) Statistics: Num rows: 10800000 Data size: 82080007 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: c_customer_sk (type: int), c_current_addr_sk (type: int) outputColumnNames: _col0, _col1 Statistics: Num rows: 10800000 Data size: 82080007 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: _col0 (type: int) null sort order: a sort order: + Map-reduce partition columns: _col0 (type: int) Statistics: Num rows: 10800000 Data size: 82080007 Basic stats: COMPLETE Column stats: NONE value expressions: _col1 (type: int) Execution mode: vectorized, llap LLAP IO: all inputs (cache only) Map 2 Map Operator Tree: TableScan alias: date_dim filterExpr: (((d_year = 1998) and (d_moy = 1)) or ((d_year = 1998) and (d_moy = 1) and d_month_seq is not null)) (type: boolean) Statistics: Num rows: 73049 Data size: 832776 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: ((d_year = 1998) and (d_moy = 1)) (type: boolean) Statistics: Num rows: 1 Data size: 11 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: d_month_seq (type: int) outputColumnNames: d_month_seq Statistics: Num rows: 1 Data size: 11 Basic stats: COMPLETE Column stats: NONE Group By Operator keys: d_month_seq (type: int) minReductionHashAggr: 0.99 mode: hash outputColumnNames: _col0 Statistics: Num rows: 1 Data size: 11 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: _col0 (type: int) null sort order: a sort order: + Map-reduce partition columns: _col0 (type: int) Statistics: Num rows: 1 Data size: 11 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: ((d_year = 1998) and (d_moy = 1) and d_month_seq is not null) (type: boolean) Statistics: Num rows: 1 Data size: 11 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: d_month_seq (type: int) outputColumnNames: d_month_seq Statistics: Num rows: 1 Data size: 11 Basic stats: COMPLETE Column stats: NONE Group By Operator keys: d_month_seq (type: int) minReductionHashAggr: 0.99 mode: hash outputColumnNames: _col0 Statistics: Num rows: 1 Data size: 11 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: _col0 (type: int) null sort order: a sort order: + Map-reduce partition columns: _col0 (type: int) Statistics: Num rows: 1 Data size: 11 Basic stats: COMPLETE Column stats: NONE Execution mode: vectorized, llap LLAP IO: all inputs (cache only) Map 6 Map Operator Tree: TableScan alias: d filterExpr: (d_month_seq is not null and d_date_sk is not null) (type: boolean) Statistics: Num rows: 73049 Data size: 555184 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: (d_month_seq is not null and d_date_sk is not null) (type: boolean) Statistics: Num rows: 65745 Data size: 499672 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: d_date_sk (type: int), d_month_seq (type: int) outputColumnNames: _col0, _col1 Statistics: Num rows: 65745 Data size: 499672 Basic stats: COMPLETE Column stats: NONE Map Join Operator condition map: Inner Join 0 to 1 keys: 0 _col1 (type: int) 1 _col0 (type: int) outputColumnNames: _col0 input vertices: 1 Reducer 5 Statistics: Num rows: 72319 Data size: 549639 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: _col0 (type: int) null sort order: a sort order: + Map-reduce partition columns: _col0 (type: int) Statistics: Num rows: 72319 Data size: 549639 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: _col0 (type: int) outputColumnNames: _col0 Statistics: Num rows: 72319 Data size: 549639 Basic stats: COMPLETE Column stats: NONE Group By Operator keys: _col0 (type: int) minReductionHashAggr: 0.99 mode: hash outputColumnNames: _col0 Statistics: Num rows: 72319 Data size: 549639 Basic stats: COMPLETE Column stats: NONE Dynamic Partitioning Event Operator Target column: ss_sold_date_sk (int) Target Input: s Partition key expr: ss_sold_date_sk Statistics: Num rows: 72319 Data size: 549639 Basic stats: COMPLETE Column stats: NONE Target Vertex: Map 7 Execution mode: vectorized, llap LLAP IO: all inputs (cache only) Map 7 Map Operator Tree: TableScan alias: s filterExpr: (ss_item_sk is not null and ss_sold_date_sk is not null and ss_customer_sk is not null) (type: boolean) Statistics: Num rows: 2755519629 Data size: 324600212104 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: (ss_item_sk is not null and ss_sold_date_sk is not null and ss_customer_sk is not null) (type: boolean) Statistics: Num rows: 2342191677 Data size: 275910179387 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: ss_item_sk (type: int), ss_customer_sk (type: int), ss_coupon_amt (type: decimal(7,2)), ss_sold_date_sk (type: int) outputColumnNames: _col0, _col1, _col2, _col3 Statistics: Num rows: 2342191677 Data size: 275910179387 Basic stats: COMPLETE Column stats: NONE Map Join Operator condition map: Inner Join 0 to 1 keys: 0 _col0 (type: int) 1 _col3 (type: int) outputColumnNames: _col3, _col4, _col5 input vertices: 0 Map 6 Statistics: Num rows: 2576410900 Data size: 303501203903 Basic stats: COMPLETE Column stats: NONE Map Join Operator condition map: Inner Join 0 to 1 keys: 0 _col3 (type: int) 1 _col0 (type: int) outputColumnNames: _col4, _col5 input vertices: 1 Reducer 12 Statistics: Num rows: 2834052051 Data size: 333851331529 Basic stats: COMPLETE Column stats: NONE Map Join Operator condition map: Inner Join 0 to 1 keys: 0 _col4 (type: int) 1 _col0 (type: int) outputColumnNames: _col5, _col13 input vertices: 1 Map 13 Statistics: Num rows: 3117457323 Data size: 367236472641 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: _col13 (type: int), _col5 (type: decimal(7,2)) outputColumnNames: _col1, _col4 Statistics: Num rows: 3117457323 Data size: 367236472641 Basic stats: COMPLETE Column stats: NONE Map Join Operator condition map: Inner Join 0 to 1 keys: 0 _col0 (type: int) 1 _col1 (type: int) outputColumnNames: _col1, _col7 input vertices: 0 Map 1 Statistics: Num rows: 3429203129 Data size: 403960128660 Basic stats: COMPLETE Column stats: NONE Group By Operator aggregations: count(), min(_col7) keys: _col1 (type: string) minReductionHashAggr: 0.99 mode: hash outputColumnNames: _col0, _col1, _col2 Statistics: Num rows: 3429203129 Data size: 403960128660 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: _col0 (type: string) null sort order: a sort order: + Map-reduce partition columns: _col0 (type: string) Statistics: Num rows: 3429203129 Data size: 403960128660 Basic stats: COMPLETE Column stats: NONE value expressions: _col1 (type: bigint), _col2 (type: decimal(7,2)) Execution mode: vectorized, llap LLAP IO: all inputs (cache only) Reducer 12 Execution mode: vectorized, llap Reduce Operator Tree: Group By Operator aggregations: sum(VALUE._col0), count(VALUE._col1) keys: KEY._col0 (type: string) mode: mergepartial outputColumnNames: _col0, _col1, _col2 Statistics: Num rows: 142500 Data size: 40071140 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: CAST( CAST( (_col1 / _col2) AS decimal(11,6)) AS decimal(16,6)) is not null (type: boolean) Statistics: Num rows: 142500 Data size: 40071140 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: _col0 (type: string), (1.2 * CAST( CAST( (_col1 / _col2) AS decimal(11,6)) AS decimal(16,6))) (type: decimal(14,7)) outputColumnNames: _col0, _col1 Statistics: Num rows: 142500 Data size: 40071140 Basic stats: COMPLETE Column stats: NONE Map Join Operator condition map: Inner Join 0 to 1 keys: 0 _col2 (type: string) 1 _col0 (type: string) outputColumnNames: _col0, _col1, _col4 input vertices: 0 Map 10 Statistics: Num rows: 280500 Data size: 79942782 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: (_col1 > _col4) (type: boolean) Statistics: Num rows: 93500 Data size: 26647594 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: _col0 (type: int) outputColumnNames: _col0 Statistics: Num rows: 93500 Data size: 26647594 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: _col0 (type: int) null sort order: a sort order: + Map-reduce partition columns: _col0 (type: int) Statistics: Num rows: 93500 Data size: 26647594 Basic stats: COMPLETE Column stats: NONE Reducer 3 Execution mode: vectorized, llap Reduce Operator Tree: Group By Operator keys: KEY._col0 (type: int) mode: mergepartial outputColumnNames: _col0 Statistics: Num rows: 1 Data size: 11 Basic stats: COMPLETE Column stats: NONE Select Operator Statistics: Num rows: 1 Data size: 11 Basic stats: COMPLETE Column stats: NONE Group By Operator aggregations: count() minReductionHashAggr: 0.99 mode: hash outputColumnNames: _col0 Statistics: Num rows: 1 Data size: 20 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator null sort order: sort order: Statistics: Num rows: 1 Data size: 20 Basic stats: COMPLETE Column stats: NONE value expressions: _col0 (type: bigint) Reducer 4 Execution mode: vectorized, llap Reduce Operator Tree: Group By Operator aggregations: count(VALUE._col0) mode: mergepartial outputColumnNames: _col0 Statistics: Num rows: 1 Data size: 20 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: (sq_count_check(_col0) <= 1L) (type: boolean) Statistics: Num rows: 1 Data size: 20 Basic stats: COMPLETE Column stats: NONE Select Operator Statistics: Num rows: 1 Data size: 20 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator null sort order: sort order: Statistics: Num rows: 1 Data size: 20 Basic stats: COMPLETE Column stats: NONE Reducer 5 Execution mode: vectorized, llap Reduce Operator Tree: Group By Operator keys: KEY._col0 (type: int) mode: mergepartial outputColumnNames: _col0 Statistics: Num rows: 1 Data size: 11 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: _col0 (type: int) null sort order: a sort order: + Map-reduce partition columns: _col0 (type: int) Statistics: Num rows: 1 Data size: 11 Basic stats: COMPLETE Column stats: NONE Reducer 8 Execution mode: vectorized, llap Reduce Operator Tree: Group By Operator aggregations: count(VALUE._col0), min(VALUE._col1) keys: KEY._col0 (type: string) mode: mergepartial outputColumnNames: _col0, _col1, _col2 Statistics: Num rows: 1714601564 Data size: 201980064271 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: (_col1 >= 10L) (type: boolean) Statistics: Num rows: 571533854 Data size: 67326688011 Basic stats: COMPLETE Column stats: NONE Top N Key Operator sort order: + keys: _col1 (type: bigint) null sort order: z Statistics: Num rows: 571533854 Data size: 67326688011 Basic stats: COMPLETE Column stats: NONE top n: 100 Reduce Output Operator key expressions: _col1 (type: bigint) null sort order: z sort order: + Statistics: Num rows: 571533854 Data size: 67326688011 Basic stats: COMPLETE Column stats: NONE value expressions: _col0 (type: string), _col2 (type: decimal(7,2)) Reducer 9 Execution mode: vectorized, llap Reduce Operator Tree: Select Operator expressions: VALUE._col0 (type: string), KEY.reducesinkkey0 (type: bigint), VALUE._col1 (type: decimal(7,2)) outputColumnNames: _col0, _col1, _col2 Statistics: Num rows: 571533854 Data size: 67326688011 Basic stats: COMPLETE Column stats: NONE Limit Number of rows: 100 Statistics: Num rows: 100 Data size: 11700 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false Statistics: Num rows: 100 Data size: 11700 Basic stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe Stage: Stage-0 Fetch Operator limit: 100 Processor Tree: ListSink