STAGE DEPENDENCIES:
  Stage-2 is a root stage
  Stage-1 depends on stages: Stage-2
  Stage-0 depends on stages: Stage-1

STAGE PLANS:
  Stage: Stage-2
    Spark
      DagName: root_20170630151444_56b28427-46cb-4141-beda-3e2b01e5e6ec:2
      Vertices:
        Map 10
            Map Operator Tree:
                TableScan
                  alias: d2
                  filterExpr: (d_date_sk is not null and (d_quarter_name) IN ('2000Q1', '2000Q2', '2000Q3')) (type: boolean)
                  Statistics: Num rows: 73049 Data size: 2045372 Basic stats: COMPLETE Column stats: NONE
                  Filter Operator
                    predicate: (d_date_sk is not null and (d_quarter_name) IN ('2000Q1', '2000Q2', '2000Q3')) (type: boolean)
                    Statistics: Num rows: 36525 Data size: 1022699 Basic stats: COMPLETE Column stats: NONE
                    Spark HashTable Sink Operator
                      keys:
                        0 _col45 (type: bigint)
                        1 d_date_sk (type: bigint)
            Local Work:
              Map Reduce Local Work
        Map 11
            Map Operator Tree:
                TableScan
                  alias: d3
                  filterExpr: (d_date_sk is not null and (d_quarter_name) IN ('2000Q1', '2000Q2', '2000Q3')) (type: boolean)
                  Statistics: Num rows: 73049 Data size: 2045372 Basic stats: COMPLETE Column stats: NONE
                  Filter Operator
                    predicate: (d_date_sk is not null and (d_quarter_name) IN ('2000Q1', '2000Q2', '2000Q3')) (type: boolean)
                    Statistics: Num rows: 36525 Data size: 1022699 Basic stats: COMPLETE Column stats: NONE
                    Spark HashTable Sink Operator
                      keys:
                        0 _col82 (type: bigint)
                        1 d_date_sk (type: bigint)
            Local Work:
              Map Reduce Local Work
        Map 12
            Map Operator Tree:
                TableScan
                  alias: store
                  filterExpr: s_store_sk is not null (type: boolean)
                  Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
                  Filter Operator
                    predicate: s_store_sk is not null (type: boolean)
                    Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
                    Spark HashTable Sink Operator
                      keys:
                        0 _col6 (type: bigint)
                        1 s_store_sk (type: bigint)
            Local Work:
              Map Reduce Local Work
        Map 13
            Map Operator Tree:
                TableScan
                  alias: item
                  filterExpr: i_item_sk is not null (type: boolean)
                  Statistics: Num rows: 360000 Data size: 7920000 Basic stats: COMPLETE Column stats: NONE
                  Filter Operator
                    predicate: i_item_sk is not null (type: boolean)
                    Statistics: Num rows: 360000 Data size: 7920000 Basic stats: COMPLETE Column stats: NONE
                    Spark HashTable Sink Operator
                      keys:
                        0 _col1 (type: bigint)
                        1 i_item_sk (type: bigint)
            Local Work:
              Map Reduce Local Work

  Stage: Stage-1
    Spark
      Edges:
        Reducer 2 <- Map 1 (PARTITION-LEVEL SORT, 1009), Map 7 (PARTITION-LEVEL SORT, 1009)
        Reducer 3 <- Map 8 (PARTITION-LEVEL SORT, 1), Reducer 2 (PARTITION-LEVEL SORT, 1)
        Reducer 4 <- Map 9 (PARTITION-LEVEL SORT, 1), Reducer 3 (PARTITION-LEVEL SORT, 1)
        Reducer 5 <- Reducer 4 (GROUP, 1009)
        Reducer 6 <- Reducer 5 (SORT, 1)
      DagName: root_20170630151444_56b28427-46cb-4141-beda-3e2b01e5e6ec:1
      Vertices:
        Map 1
            Map Operator Tree:
                TableScan
                  alias: store_sales
                  filterExpr: (ss_customer_sk is not null and ss_item_sk is not null and ss_ticket_number is not null and ss_store_sk is not null) (type: boolean)
                  Statistics: Num rows: 8251124389 Data size: 247533731670 Basic stats: COMPLETE Column stats: PARTIAL
                  Filter Operator
                    predicate: (ss_customer_sk is not null and ss_item_sk is not null and ss_ticket_number is not null and ss_store_sk is not null) (type: boolean)
                    Statistics: Num rows: 8251124389 Data size: 66008995112 Basic stats: COMPLETE Column stats: PARTIAL
                    Reduce Output Operator
                      key expressions: ss_customer_sk (type: bigint), ss_item_sk (type: bigint), ss_ticket_number (type: bigint)
                      sort order: +++
                      Map-reduce partition columns: ss_customer_sk (type: bigint), ss_item_sk (type: bigint), ss_ticket_number (type: bigint)
                      Statistics: Num rows: 8251124389 Data size: 66008995112 Basic stats: COMPLETE Column stats: PARTIAL
                      value expressions: ss_store_sk (type: bigint), ss_quantity (type: int), ss_sold_date_sk (type: bigint)
        Map 7
            Map Operator Tree:
                TableScan
                  alias: store_returns
                  filterExpr: (sr_customer_sk is not null and sr_item_sk is not null and sr_ticket_number is not null) (type: boolean)
                  Statistics: Num rows: 833750016 Data size: 22511250432 Basic stats: COMPLETE Column stats: PARTIAL
                  Filter Operator
                    predicate: (sr_customer_sk is not null and sr_item_sk is not null and sr_ticket_number is not null) (type: boolean)
                    Statistics: Num rows: 833750016 Data size: 6670000128 Basic stats: COMPLETE Column stats: PARTIAL
                    Reduce Output Operator
                      key expressions: sr_customer_sk (type: bigint), sr_item_sk (type: bigint), sr_ticket_number (type: bigint)
                      sort order: +++
                      Map-reduce partition columns: sr_customer_sk (type: bigint), sr_item_sk (type: bigint), sr_ticket_number (type: bigint)
                      Statistics: Num rows: 833750016 Data size: 6670000128 Basic stats: COMPLETE Column stats: PARTIAL
                      value expressions: sr_return_quantity (type: int), sr_returned_date_sk (type: bigint)
        Map 8
            Map Operator Tree:
                TableScan
                  alias: catalog_sales
                  filterExpr: (cs_bill_customer_sk is not null and cs_item_sk is not null) (type: boolean)
                  Statistics: Num rows: 4298474285 Data size: 176237445685 Basic stats: COMPLETE Column stats: PARTIAL
                  Filter Operator
                    predicate: (cs_bill_customer_sk is not null and cs_item_sk is not null) (type: boolean)
                    Statistics: Num rows: 4298474285 Data size: 34387794280 Basic stats: COMPLETE Column stats: PARTIAL
                    Reduce Output Operator
                      key expressions: cs_bill_customer_sk (type: bigint), cs_item_sk (type: bigint)
                      sort order: ++
                      Map-reduce partition columns: cs_bill_customer_sk (type: bigint), cs_item_sk (type: bigint)
                      Statistics: Num rows: 4298474285 Data size: 34387794280 Basic stats: COMPLETE Column stats: PARTIAL
                      value expressions: cs_quantity (type: int), cs_sold_date_sk (type: bigint)
        Map 9
            Map Operator Tree:
                TableScan
                  alias: d1
                  filterExpr: (d_date_sk is not null and (d_quarter_name = '2000Q1')) (type: boolean)
                  Statistics: Num rows: 73049 Data size: 2045372 Basic stats: COMPLETE Column stats: NONE
                  Filter Operator
                    predicate: (d_date_sk is not null and (d_quarter_name = '2000Q1')) (type: boolean)
                    Statistics: Num rows: 36524 Data size: 1022672 Basic stats: COMPLETE Column stats: NONE
                    Reduce Output Operator
                      key expressions: d_date_sk (type: bigint)
                      sort order: +
                      Map-reduce partition columns: d_date_sk (type: bigint)
                      Statistics: Num rows: 36524 Data size: 1022672 Basic stats: COMPLETE Column stats: NONE
        Reducer 2
            Reduce Operator Tree:
              Join Operator
                condition map:
                  Inner Join 0 to 1
                keys:
                  0 ss_customer_sk (type: bigint), ss_item_sk (type: bigint), ss_ticket_number (type: bigint)
                  1 sr_customer_sk (type: bigint), sr_item_sk (type: bigint), sr_ticket_number (type: bigint)
                outputColumnNames: _col1, _col2, _col6, _col8, _col9, _col22, _col27, _col28, _col34, _col35, _col45
                Statistics: Num rows: 3439687545673370112 Data size: 9223372036854775807 Basic stats: COMPLETE Column stats: PARTIAL
                Reduce Output Operator
                  key expressions: _col28 (type: bigint), _col27 (type: bigint)
                  sort order: ++
                  Map-reduce partition columns: _col28 (type: bigint), _col27 (type: bigint)
                  Statistics: Num rows: 3439687545673370112 Data size: 9223372036854775807 Basic stats: COMPLETE Column stats: PARTIAL
                  value expressions: _col1 (type: bigint), _col2 (type: bigint), _col6 (type: bigint), _col8 (type: bigint), _col9 (type: int), _col22 (type: bigint), _col34 (type: bigint), _col35 (type: int), _col45 (type: bigint)
        Reducer 3
            Reduce Operator Tree:
              Join Operator
                condition map:
                  Inner Join 0 to 1
                keys:
                  0 _col28 (type: bigint), _col27 (type: bigint)
                  1 cs_bill_customer_sk (type: bigint), cs_item_sk (type: bigint)
                outputColumnNames: _col1, _col2, _col6, _col8, _col9, _col22, _col27, _col28, _col34, _col35, _col45, _col51, _col63, _col66, _col82
                Statistics: Num rows: 9223372036854775807 Data size: 9223372036854775807 Basic stats: COMPLETE Column stats: PARTIAL
                Reduce Output Operator
                  key expressions: _col22 (type: bigint)
                  sort order: +
                  Map-reduce partition columns: _col22 (type: bigint)
                  Statistics: Num rows: 9223372036854775807 Data size: 9223372036854775807 Basic stats: COMPLETE Column stats: PARTIAL
                  value expressions: _col1 (type: bigint), _col2 (type: bigint), _col6 (type: bigint), _col8 (type: bigint), _col9 (type: int), _col27 (type: bigint), _col28 (type: bigint), _col34 (type: bigint), _col35 (type: int), _col45 (type: bigint), _col51 (type: bigint), _col63 (type: bigint), _col66 (type: int), _col82 (type: bigint)
        Reducer 4
            Local Work:
              Map Reduce Local Work
            Reduce Operator Tree:
              Join Operator
                condition map:
                  Inner Join 0 to 1
                keys:
                  0 _col22 (type: bigint)
                  1 d_date_sk (type: bigint)
                outputColumnNames: _col1, _col2, _col6, _col8, _col9, _col22, _col27, _col28, _col34, _col35, _col45, _col51, _col63, _col66, _col82, _col86
                Statistics: Num rows: 9223372036854775807 Data size: 9223372036854775807 Basic stats: COMPLETE Column stats: NONE
                Map Join Operator
                  condition map:
                    Inner Join 0 to 1
                  keys:
                    0 _col45 (type: bigint)
                    1 d_date_sk (type: bigint)
                  outputColumnNames: _col1, _col2, _col6, _col8, _col9, _col22, _col27, _col28, _col34, _col35, _col45, _col51, _col63, _col66, _col82, _col86, _col117
                  input vertices:
                    1 Map 10
                  Statistics: Num rows: 9223372036854775807 Data size: 9223372036854775807 Basic stats: COMPLETE Column stats: NONE
                  Map Join Operator
                    condition map:
                      Inner Join 0 to 1
                    keys:
                      0 _col82 (type: bigint)
                      1 d_date_sk (type: bigint)
                    outputColumnNames: _col1, _col2, _col6, _col8, _col9, _col22, _col27, _col28, _col34, _col35, _col45, _col51, _col63, _col66, _col82, _col86, _col117, _col148
                    input vertices:
                      1 Map 11
                    Statistics: Num rows: 9223372036854775807 Data size: 9223372036854775807 Basic stats: COMPLETE Column stats: NONE
                    Map Join Operator
                      condition map:
                        Inner Join 0 to 1
                      keys:
                        0 _col6 (type: bigint)
                        1 s_store_sk (type: bigint)
                      outputColumnNames: _col1, _col2, _col6, _col8, _col9, _col22, _col27, _col28, _col34, _col35, _col45, _col51, _col63, _col66, _col82, _col86, _col117, _col148, _col179, _col203
                      input vertices:
                        1 Map 12
                      Statistics: Num rows: 9223372036854775807 Data size: 9223372036854775807 Basic stats: COMPLETE Column stats: NONE
                      Map Join Operator
                        condition map:
                          Inner Join 0 to 1
                        keys:
                          0 _col1 (type: bigint)
                          1 i_item_sk (type: bigint)
                        outputColumnNames: _col1, _col2, _col6, _col8, _col9, _col22, _col27, _col28, _col34, _col35, _col45, _col51, _col63, _col66, _col82, _col86, _col117, _col148, _col179, _col203, _col211, _col212, _col215
                        input vertices:
                          1 Map 13
                        Statistics: Num rows: 9223372036854775807 Data size: 9223372036854775807 Basic stats: COMPLETE Column stats: NONE
                        Filter Operator
                          predicate: ((_col86 = _col22) and (_col82 = _col148) and (_col179 = _col6) and (_col2 = _col28) and (_col1 = _col27) and (_col8 = _col34) and (_col211 = _col1) and (_col45 = _col117) and (_col28 = _col51) and (_col27 = _col63)) (type: boolean)
                          Statistics: Num rows: 9007199254740991 Data size: 9007199254740991 Basic stats: COMPLETE Column stats: NONE
                          Select Operator
                            expressions: _col212 (type: string), _col215 (type: string), _col203 (type: string), _col9 (type: int), _col35 (type: int), _col66 (type: int)
                            outputColumnNames: _col212, _col215, _col203, _col9, _col35, _col66
                            Statistics: Num rows: 9007199254740991 Data size: 9007199254740991 Basic stats: COMPLETE Column stats: NONE
                            Group By Operator
                              aggregations: count(_col9), avg(_col9), stddev_samp(_col9), count(_col35), avg(_col35), stddev_samp(_col35), count(_col66), avg(_col66), stddev_samp(_col66)
                              keys: _col212 (type: string), _col215 (type: string), _col203 (type: string)
                              mode: hash
                              outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11
                              Statistics: Num rows: 9007199254740991 Data size: 9007199254740991 Basic stats: COMPLETE Column stats: NONE
                              Reduce Output Operator
                                key expressions: _col0 (type: string), _col1 (type: string), _col2 (type: string)
                                sort order: +++
                                Map-reduce partition columns: _col0 (type: string), _col1 (type: string), _col2 (type: string)
                                Statistics: Num rows: 9007199254740991 Data size: 9007199254740991 Basic stats: COMPLETE Column stats: NONE
                                TopN Hash Memory Usage: 0.04
                                value expressions: _col3 (type: bigint), _col4 (type: struct), _col5 (type: struct), _col6 (type: bigint), _col7 (type: struct), _col8 (type: struct), _col9 (type: bigint), _col10 (type: struct), _col11 (type: struct)
        Reducer 5
            Reduce Operator Tree:
              Group By Operator
                aggregations: count(VALUE._col0), avg(VALUE._col1), stddev_samp(VALUE._col2), count(VALUE._col3), avg(VALUE._col4), stddev_samp(VALUE._col5), count(VALUE._col6), avg(VALUE._col7), stddev_samp(VALUE._col8)
                keys: KEY._col0 (type: string), KEY._col1 (type: string), KEY._col2 (type: string)
                mode: mergepartial
                outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11
                Statistics: Num rows: 4503599627370495 Data size: 4503599627370495 Basic stats: COMPLETE Column stats: NONE
                Select Operator
                  expressions: _col0 (type: string), _col1 (type: string), (_col8 / _col7) (type: double), _col9 (type: bigint), _col10 (type: double), (_col11 / _col10) (type: double), _col2 (type: string), _col3 (type: bigint), _col4 (type: double), _col5 (type: double), (_col5 / _col4) (type: double), _col6 (type: bigint), _col7 (type: double), _col8 (type: double)
                  outputColumnNames: _col0, _col1, _col10, _col11, _col12, _col13, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9
                  Statistics: Num rows: 4503599627370495 Data size: 4503599627370495 Basic stats: COMPLETE Column stats: NONE
                  Reduce Output Operator
                    key expressions: _col0 (type: string), _col1 (type: string), _col2 (type: string)
                    sort order: +++
                    Statistics: Num rows: 4503599627370495 Data size: 4503599627370495 Basic stats: COMPLETE Column stats: NONE
                    TopN Hash Memory Usage: 0.04
                    value expressions: _col3 (type: bigint), _col4 (type: double), _col5 (type: double), _col6 (type: double), _col7 (type: bigint), _col8 (type: double), _col9 (type: double), _col10 (type: double), _col11 (type: bigint), _col12 (type: double), _col13 (type: double)
        Reducer 6
            Execution mode: vectorized
            Reduce Operator Tree:
              Select Operator
                expressions: KEY.reducesinkkey0 (type: string), KEY.reducesinkkey1 (type: string), KEY.reducesinkkey2 (type: string), VALUE._col0 (type: bigint), VALUE._col1 (type: double), VALUE._col2 (type: double), VALUE._col3 (type: double), VALUE._col4 (type: bigint), VALUE._col5 (type: double), VALUE._col6 (type: double), VALUE._col7 (type: double), VALUE._col8 (type: bigint), VALUE._col9 (type: double), VALUE._col10 (type: double), VALUE._col10 (type: double)
                outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14
                Statistics: Num rows: 4503599627370495 Data size: 4503599627370495 Basic stats: COMPLETE Column stats: NONE
                Limit
                  Number of rows: 100
                  Statistics: Num rows: 100 Data size: 100 Basic stats: COMPLETE Column stats: NONE
                  File Output Operator
                    compressed: false
                    Statistics: Num rows: 100 Data size: 100 Basic stats: COMPLETE Column stats: NONE
                    table:
                        input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                        output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                        serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe

  Stage: Stage-0
    Fetch Operator
      limit: 100
      Processor Tree:
        ListSink

Time taken: 50.212 seconds, Fetched: 286 row(s)
hive>