STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 depends on stages: Stage-1

STAGE PLANS:
  Stage: Stage-1
    Tez
      Edges:
        Map 4 <- Map 2 (BROADCAST_EDGE)
        Reducer 5 <- Map 1 (BROADCAST_EDGE), Map 3 (BROADCAST_EDGE), Map 4 (SIMPLE_EDGE), Map 7 (SIMPLE_EDGE), Map 8 (BROADCAST_EDGE)
        Reducer 6 <- Reducer 5 (SIMPLE_EDGE)
      DagName: yarn_20151217015656_51819384-6f3e-4a36-9c54-2cfcc5ed1e87:1
      Vertices:
        Map 1
            Map Operator Tree:
                TableScan
                  alias: household_demographics
                  Statistics: Num rows: 18956 Data size: 151653 Basic stats: COMPLETE Column stats: NONE
                  Reduce Output Operator
                    sort order:
                    Statistics: Num rows: 18956 Data size: 151653 Basic stats: COMPLETE Column stats: NONE
                    value expressions: hd_demo_sk (type: int), hd_dep_count (type: int)
        Map 2
            Map Operator Tree:
                TableScan
                  alias: store
                  Statistics: Num rows: 1472 Data size: 5889 Basic stats: COMPLETE Column stats: NONE
                  Filter Operator
                    predicate: s_store_sk is not null (type: boolean)
                    Statistics: Num rows: 736 Data size: 2944 Basic stats: COMPLETE Column stats: NONE
                    Reduce Output Operator
                      key expressions: s_store_sk (type: int)
                      sort order: +
                      Map-reduce partition columns: s_store_sk (type: int)
                      Statistics: Num rows: 736 Data size: 2944 Basic stats: COMPLETE Column stats: NONE
        Map 3
            Map Operator Tree:
                TableScan
                  alias: customer_address
                  Statistics: Num rows: 38875 Data size: 7930592 Basic stats: COMPLETE Column stats: NONE
                  Reduce Output Operator
                    sort order:
                    Statistics: Num rows: 38875 Data size: 7930592 Basic stats: COMPLETE Column stats: NONE
                    value expressions: ca_address_sk (type: int), ca_state (type: string), ca_country (type: string)
        Map 4
            Map Operator Tree:
                TableScan
                  alias: store_sales
                  Statistics: Num rows: 19600662 Data size: 784026496 Basic stats: COMPLETE Column stats: NONE
                  Filter Operator
                    predicate: (ss_store_sk is not null and ss_sold_date_sk is not null) (type: boolean)
                    Statistics: Num rows: 4900166 Data size: 196006644 Basic stats: COMPLETE Column stats: NONE
                    Map Join Operator
                      condition map:
                           Inner Join 0 to 1
                      condition expressions:
                        0 {ss_sold_date_sk} {ss_cdemo_sk} {ss_hdemo_sk} {ss_addr_sk} {ss_store_sk} {ss_quantity} {ss_sales_price} {ss_ext_sales_price} {ss_ext_wholesale_cost} {ss_net_profit}
                        1 {s_store_sk}
                      keys:
                        0 ss_store_sk (type: int)
                        1 s_store_sk (type: int)
                      outputColumnNames: _col0, _col4, _col5, _col6, _col7, _col10, _col13, _col15, _col16, _col22, _col26
                      input vertices:
                        1 Map 2
                      Statistics: Num rows: 5390182 Data size: 215607313 Basic stats: COMPLETE Column stats: NONE
                      Reduce Output Operator
                        sort order:
                        Statistics: Num rows: 5390182 Data size: 215607313 Basic stats: COMPLETE Column stats: NONE
                        value expressions: _col0 (type: int), _col4 (type: int), _col5 (type: int), _col6 (type: int), _col7 (type: int), _col10 (type: int), _col13 (type: float), _col15 (type: float), _col16 (type: float), _col22 (type: float), _col26 (type: int)
        Map 7
            Map Operator Tree:
                TableScan
                  alias: customer_demographics
                  Statistics: Num rows: 395392 Data size: 80660096 Basic stats: COMPLETE Column stats: NONE
                  Reduce Output Operator
                    sort order:
                    Statistics: Num rows: 395392 Data size: 80660096 Basic stats: COMPLETE Column stats: NONE
                    value expressions: cd_demo_sk (type: int), cd_marital_status (type: string), cd_education_status (type: string)
        Map 8
            Map Operator Tree:
                TableScan
                  alias: date_dim
                  Statistics: Num rows: 1289679 Data size: 10317438 Basic stats: COMPLETE Column stats: NONE
                  Filter Operator
                    predicate: (d_date_sk is not null and (d_year = 2001)) (type: boolean)
                    Statistics: Num rows: 322420 Data size: 2579361 Basic stats: COMPLETE Column stats: NONE
                    Reduce Output Operator
                      key expressions: d_date_sk (type: int)
                      sort order: +
                      Map-reduce partition columns: d_date_sk (type: int)
                      Statistics: Num rows: 322420 Data size: 2579361 Basic stats: COMPLETE Column stats: NONE
        Reducer 5
            Reduce Operator Tree:
                Merge Join Operator
                  condition map:
                       Inner Join 0 to 1
                  condition expressions:
                    0 {VALUE._col0} {VALUE._col4} {VALUE._col5} {VALUE._col6} {VALUE._col7} {VALUE._col10} {VALUE._col13} {VALUE._col15} {VALUE._col16} {VALUE._col22} {VALUE._col26}
                    1 {VALUE._col0} {VALUE._col2} {VALUE._col3}
                  outputColumnNames: _col0, _col4, _col5, _col6, _col7, _col10, _col13, _col15, _col16, _col22, _col26, _col58, _col60, _col61
                  Statistics: Num rows: 5929200 Data size: 237168049 Basic stats: COMPLETE Column stats: NONE
                  Map Join Operator
                    condition map:
                         Inner Join 0 to 1
                    condition expressions:
                      0 {_col0} {_col4} {_col5} {_col6} {_col7} {_col10} {_col13} {_col15} {_col16} {_col22} {_col26} {_col58} {_col60} {_col61}
                      1 {hd_demo_sk} {hd_dep_count}
                    keys:
                      0
                      1
                    outputColumnNames: _col0, _col4, _col5, _col6, _col7, _col10, _col13, _col15, _col16, _col22, _col26, _col58, _col60, _col61, _col70, _col73
                    input vertices:
                      1 Map 1
                    Statistics: Num rows: 6522120 Data size: 260884859 Basic stats: COMPLETE Column stats: NONE
                    Map Join Operator
                      condition map:
                           Inner Join 0 to 1
                      condition expressions:
                        0 {_col0} {_col4} {_col5} {_col6} {_col7} {_col10} {_col13} {_col15} {_col16} {_col22} {_col26} {_col58} {_col60} {_col61} {_col70} {_col73}
                        1 {ca_address_sk} {ca_state} {ca_country}
                      keys:
                        0
                        1
                      outputColumnNames: _col0, _col4, _col5, _col6, _col7, _col10, _col13, _col15, _col16, _col22, _col26, _col58, _col60, _col61, _col70, _col73, _col78, _col86, _col88
                      input vertices:
                        1 Map 3
                      Statistics: Num rows: 7174332 Data size: 286973351 Basic stats: COMPLETE Column stats: NONE
                      Map Join Operator
                        condition map:
                             Inner Join 0 to 1
                        condition expressions:
                          0 {_col0} {_col4} {_col5} {_col6} {_col7} {_col10} {_col13} {_col15} {_col16} {_col22} {_col26} {_col58} {_col60} {_col61} {_col70} {_col73} {_col78} {_col86} {_col88}
                          1 {d_date_sk}
                        keys:
                          0 _col0 (type: int)
                          1 d_date_sk (type: int)
                        outputColumnNames: _col0, _col4, _col5, _col6, _col7, _col10, _col13, _col15, _col16, _col22, _col26, _col58, _col60, _col61, _col70, _col73, _col78, _col86, _col88, _col94
                        input vertices:
                          1 Map 8
                        Statistics: Num rows: 7891765 Data size: 315670692 Basic stats: COMPLETE Column stats: NONE
                        Filter Operator
                          predicate: ((((_col26 = _col7) and (_col0 = _col94)) and ((((((((_col5 = _col70) and (_col58 = _col4)) and (_col60 = 'M')) and (_col61 = '4 yr Degree')) and _col13 BETWEEN 100.0 AND 150.0) and (_col73 = 3)) or ((((((_col5 = _col70) and (_col58 = _col4)) and (_col60 = 'D')) and (_col61 = 'Primary')) and _col13 BETWEEN 50.0 AND 100.0) and (_col73 = 1))) or ((((((_col5 = _col70) and (_col58 = _col4)) and (_col60 = 'U')) and (_col61 = 'Advanced Degree')) and _col13 BETWEEN 150.0 AND 200.0) and (_col73 = 1)))) and ((((((_col6 = _col78) and (_col88 = 'United States')) and (_col86) IN ('KY', 'GA', 'NM')) and _col22 BETWEEN 100 AND 200) or ((((_col6 = _col78) and (_col88 = 'United States')) and (_col86) IN ('MT', 'OR', 'IN')) and _col22 BETWEEN 150 AND 300)) or ((((_col6 = _col78) and (_col88 = 'United States')) and (_col86) IN ('WI', 'MO', 'WV')) and _col22 BETWEEN 50 AND 250))) (type: boolean)
                          Statistics: Num rows: 17340 Data size: 693600 Basic stats: COMPLETE Column stats: NONE
                          Select Operator
                            expressions: _col10 (type: int), _col15 (type: float), _col16 (type: float)
                            outputColumnNames: _col10, _col15, _col16
                            Statistics: Num rows: 17340 Data size: 693600 Basic stats: COMPLETE Column stats: NONE
                            Group By Operator
                              aggregations: avg(_col10), avg(_col15), avg(_col16), sum(_col16)
                              mode: hash
                              outputColumnNames: _col0, _col1, _col2, _col3
                              Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
                              Reduce Output Operator
                                sort order:
                                Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE
                                value expressions: _col0 (type: struct), _col1 (type: struct), _col2 (type: struct), _col3 (type: double)
        Reducer 6
            Reduce Operator Tree:
                Group By Operator
                  aggregations: avg(VALUE._col0), avg(VALUE._col1), avg(VALUE._col2), sum(VALUE._col3)
                  mode: mergepartial
                  outputColumnNames: _col0, _col1, _col2, _col3
                  Statistics: Num rows: 1 Data size: 32 Basic stats: COMPLETE Column stats: NONE
                  Select Operator
                    expressions: _col0 (type: double), _col1 (type: double), _col2 (type: double), _col3 (type: double)
                    outputColumnNames: _col0, _col1, _col2, _col3
                    Statistics: Num rows: 1 Data size: 32 Basic stats: COMPLETE Column stats: NONE
                    File Output Operator
                      compressed: false
                      Statistics: Num rows: 1 Data size: 32 Basic stats: COMPLETE Column stats: NONE
                      table:
                          input format: org.apache.hadoop.mapred.TextInputFormat
                          output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                          serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe

  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        ListSink