use tpcds_bin_partitioned_orc_1000
set tez.runtime.io.sort.mb=32
set tez.unordered.output.max-per-buffer.size-bytes=33554432
set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager
set hive.support.concurrency=false
set hive.exec.pre.hooks=
set hive.exec.post.hooks=
set tez.dag.recovery.enabled=false
set hive.stats.fetch.column.stats=false
set tez.task.get-task.sleep.interval-ms.max=1
set hive.mapjoin.hybridgrace.hashtable=true
set tez.history.logging.service.class=org.apache.tez.dag.history.logging.impl.SimpleHistoryLoggingService
set hive.exec.reducers.max=32
set hive.cbo.enable=false
set hive.tez.exec.print.summary=true
set hive.optimize.dynamic.partition.hashjoin=true
set hive.vectorized.execution.reduce.enabled=true
set hive.vectorized.execution.reduce.enabled=true
set hive.explain.user=false

explain
-- logical
select  i_item_id
       ,i_item_desc
       ,i_current_price
 from item, inventory, date_dim, store_sales
 where i_current_price between 30 and 30+30
 and inv_item_sk = i_item_sk
 and d_date_sk=inv_date_sk
 and d_date between '2002-05-30' and '2002-07-30'
 and i_manufact_id in (437,129,727,663)
 and inv_quantity_on_hand between 100 and 500
 and ss_item_sk = i_item_sk
 group by i_item_id,i_item_desc,i_current_price
 order by i_item_id
 limit 100

STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 depends on stages: Stage-1

STAGE PLANS:
  Stage: Stage-1
    Tez
      Edges:
        Reducer 5 <- Map 1 (CUSTOM_SIMPLE_EDGE), Map 2 (CUSTOM_SIMPLE_EDGE), Map 3 (BROADCAST_EDGE), Map 4 (CUSTOM_SIMPLE_EDGE)
        Reducer 6 <- Reducer 5 (SIMPLE_EDGE)
        Reducer 7 <- Reducer 6 (SIMPLE_EDGE)
      DagName: gopal_20151028171141_eadc3461-c27d-4057-aef2-1d4b86bfe2cf:1
      Vertices:
        Map 1
            Map Operator Tree:
                TableScan
                  alias: item
                  filterExpr: ((i_item_sk is not null and i_current_price BETWEEN 30 AND 60) and (i_manufact_id) IN (437, 129, 727, 663)) (type: boolean)
                  Statistics: Num rows: 300000 Data size: 430987081 Basic stats: COMPLETE Column stats: NONE
                  Filter Operator
                    predicate: ((i_item_sk is not null and i_current_price BETWEEN 30 AND 60) and (i_manufact_id) IN (437, 129, 727, 663)) (type: boolean)
                    Statistics: Num rows: 37500 Data size: 53873385 Basic stats: COMPLETE Column stats: NONE
                    Reduce Output Operator
                      key expressions: i_item_sk (type: int)
                      sort order: +
                      Map-reduce partition columns: i_item_sk (type: int)
                      Statistics: Num rows: 37500 Data size: 53873385 Basic stats: COMPLETE Column stats: NONE
                      value expressions: i_item_id (type: string), i_item_desc (type: string), i_current_price (type: float)
            Execution mode: vectorized, llap
            LLAP IO: all inputs
        Map 2
            Map Operator Tree:
                TableScan
                  alias: inventory
                  filterExpr: (inv_item_sk is not null and inv_quantity_on_hand BETWEEN 100 AND 500) (type: boolean)
                  Statistics: Num rows: 783000000 Data size: 9239384968 Basic stats: COMPLETE Column stats: NONE
                  Filter Operator
                    predicate: (inv_item_sk is not null and inv_quantity_on_hand BETWEEN 100 AND 500) (type: boolean)
                    Statistics: Num rows: 195750000 Data size: 2309846242 Basic stats: COMPLETE Column stats: NONE
                    Reduce Output Operator
                      key expressions: inv_item_sk (type: int)
                      sort order: +
                      Map-reduce partition columns: inv_item_sk (type: int)
                      Statistics: Num rows: 195750000 Data size: 2309846242 Basic stats: COMPLETE Column stats: NONE
                      value expressions: inv_date_sk (type: int)
            Execution mode: vectorized, llap
            LLAP IO: all inputs
        Map 3
            Map Operator Tree:
                TableScan
                  alias: date_dim
                  filterExpr: (d_date_sk is not null and d_date BETWEEN '2002-05-30' AND '2002-07-30') (type: boolean)
                  Statistics: Num rows: 73049 Data size: 81741831 Basic stats: COMPLETE Column stats: NONE
                  Filter Operator
                    predicate: (d_date_sk is not null and d_date BETWEEN '2002-05-30' AND '2002-07-30') (type: boolean)
                    Statistics: Num rows: 18262 Data size: 20435178 Basic stats: COMPLETE Column stats: NONE
                    Reduce Output Operator
                      key expressions: d_date_sk (type: int)
                      sort order: +
                      Map-reduce partition columns: d_date_sk (type: int)
                      Statistics: Num rows: 18262 Data size: 20435178 Basic stats: COMPLETE Column stats: NONE
                    Select Operator
                      expressions: d_date_sk (type: int)
                      outputColumnNames: _col0
                      Statistics: Num rows: 18262 Data size: 20435178 Basic stats: COMPLETE Column stats: NONE
                      Group By Operator
                        keys: _col0 (type: int)
                        mode: hash
                        outputColumnNames: _col0
                        Statistics: Num rows: 18262 Data size: 20435178 Basic stats: COMPLETE Column stats: NONE
                        Dynamic Partitioning Event Operator
                          Target Input: inventory
                          Partition key expr: inv_date_sk
                          Statistics: Num rows: 18262 Data size: 20435178 Basic stats: COMPLETE Column stats: NONE
                          Target column: inv_date_sk
                          Target Vertex: Map 2
            Execution mode: vectorized, llap
            LLAP IO: all inputs
        Map 4
            Map Operator Tree:
                TableScan
                  alias: store_sales
                  filterExpr: ss_item_sk is not null (type: boolean)
                  Statistics: Num rows: 2879987999 Data size: 254591394436 Basic stats: COMPLETE Column stats: NONE
                  Filter Operator
                    predicate: ss_item_sk is not null (type: boolean)
                    Statistics: Num rows: 1439994000 Data size: 127295697262 Basic stats: COMPLETE Column stats: NONE
                    Reduce Output Operator
                      key expressions: ss_item_sk (type: int)
                      sort order: +
                      Map-reduce partition columns: ss_item_sk (type: int)
                      Statistics: Num rows: 1439994000 Data size: 127295697262 Basic stats: COMPLETE Column stats: NONE
            Execution mode: vectorized, llap
            LLAP IO: all inputs
        Reducer 5
            Execution mode: vectorized, llap
            Reduce Operator Tree:
              Map Join Operator
                condition map:
                     Inner Join 0 to 1
                     Inner Join 0 to 2
                keys:
                  0 KEY.reducesinkkey0 (type: int)
                  1 KEY.reducesinkkey0 (type: int)
                  2 KEY.reducesinkkey0 (type: int)
                outputColumnNames: _col0, _col1, _col4, _col5, _col25, _col28, _col33
                input vertices:
                  0 Map 1
                  1 Map 2
                Statistics: Num rows: 3167986868 Data size: 280050540046 Basic stats: COMPLETE Column stats: NONE
                CustomEdgeJoin: true
                HybridGraceHashJoin: true
                Map Join Operator
                  condition map:
                       Inner Join 0 to 1
                  keys:
                    0 _col28 (type: int)
                    1 d_date_sk (type: int)
                  outputColumnNames: _col0, _col1, _col4, _col5, _col25, _col28, _col33, _col58
                  input vertices:
                    1 Map 3
                  Statistics: Num rows: 3484785630 Data size: 308055600727 Basic stats: COMPLETE Column stats: NONE
                  HybridGraceHashJoin: true
                  Filter Operator
                    predicate: (((_col58 = _col28) and (_col33 = _col0)) and (_col25 = _col0)) (type: boolean)
                    Statistics: Num rows: 435598203 Data size: 38506950024 Basic stats: COMPLETE Column stats: NONE
                    Select Operator
                      expressions: _col1 (type: string), _col4 (type: string), _col5 (type: float)
                      outputColumnNames: _col1, _col4, _col5
                      Statistics: Num rows: 435598203 Data size: 38506950024 Basic stats: COMPLETE Column stats: NONE
                      Group By Operator
                        keys: _col1 (type: string), _col4 (type: string), _col5 (type: float)
                        mode: hash
                        outputColumnNames: _col0, _col1, _col2
                        Statistics: Num rows: 435598203 Data size: 38506950024 Basic stats: COMPLETE Column stats: NONE
                        Reduce Output Operator
                          key expressions: _col0 (type: string), _col1 (type: string), _col2 (type: float)
                          sort order: +++
                          Map-reduce partition columns: _col0 (type: string), _col1 (type: string), _col2 (type: float)
                          Statistics: Num rows: 435598203 Data size: 38506950024 Basic stats: COMPLETE Column stats: NONE
        Reducer 6
            Execution mode: vectorized, llap
            Reduce Operator Tree:
              Group By Operator
                keys: KEY._col0 (type: string), KEY._col1 (type: string), KEY._col2 (type: float)
                mode: mergepartial
                outputColumnNames: _col0, _col1, _col2
                Statistics: Num rows: 217799101 Data size: 19253474967 Basic stats: COMPLETE Column stats: NONE
                Reduce Output Operator
                  key expressions: _col0 (type: string)
                  sort order: +
                  Statistics: Num rows: 217799101 Data size: 19253474967 Basic stats: COMPLETE Column stats: NONE
                  TopN Hash Memory Usage: 0.04
                  value expressions: _col1 (type: string), _col2 (type: float)
        Reducer 7
            Execution mode: vectorized, llap
            Reduce Operator Tree:
              Select Operator
                expressions: KEY.reducesinkkey0 (type: string), VALUE._col0 (type: string), VALUE._col1 (type: float)
                outputColumnNames: _col0, _col1, _col2
                Statistics: Num rows: 217799101 Data size: 19253474967 Basic stats: COMPLETE Column stats: NONE
                Limit
                  Number of rows: 100
                  Statistics: Num rows: 100 Data size: 8800 Basic stats: COMPLETE Column stats: NONE
                  File Output Operator
                    compressed: false
                    Statistics: Num rows: 100 Data size: 8800 Basic stats: COMPLETE Column stats: NONE
                    table:
                        input format: org.apache.hadoop.mapred.TextInputFormat
                        output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                        serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe

  Stage: Stage-0
    Fetch Operator
      limit: 100
      Processor Tree:
        ListSink