+--------------------------------------------------------------------------------------------------------------------------------------------------+--+
| STAGE DEPENDENCIES: |
|   Stage-2 is a root stage |
|   Stage-1 depends on stages: Stage-2 |
|   Stage-0 depends on stages: Stage-1 |
| |
| STAGE PLANS: |
|   Stage: Stage-2 |
|     Spark |
|       DagName: omm_20160723095042_5a24f0ed-b329-4853-a813-474a22f68a98:2 |
|       Vertices: |
|         Map 1 |
|           Map Operator Tree: |
|             TableScan |
|               alias: date_dim |
|               filterExpr: (((d_dow) IN (6, 0) and (d_year) IN (1998, 1999, 2000)) and d_date_sk is not null) (type: boolean) |
|               Statistics: Num rows: 73049 Data size: 81741831 Basic stats: COMPLETE Column stats: COMPLETE |
|               Filter Operator |
|                 predicate: (((d_dow) IN (6, 0) and (d_year) IN (1998, 1999, 2000)) and d_date_sk is not null) (type: boolean) |
|                 Statistics: Num rows: 18262 Data size: 219144 Basic stats: COMPLETE Column stats: COMPLETE |
|                 Select Operator |
|                   expressions: d_date_sk (type: int) |
|                   outputColumnNames: _col0 |
|                   Statistics: Num rows: 18262 Data size: 73048 Basic stats: COMPLETE Column stats: COMPLETE |
|                   Spark HashTable Sink Operator |
|                     keys: |
|                       0 ss_sold_date_sk (type: int) |
|                       1 _col0 (type: int) |
|           Local Work: |
|             Map Reduce Local Work |
|         Map 2 |
|           Map Operator Tree: |
|             TableScan |
|               alias: household_demographics |
|               filterExpr: (((hd_dep_count = 4) or (hd_vehicle_count = 2)) and hd_demo_sk is not null) (type: boolean) |
|               Statistics: Num rows: 7200 Data size: 770400 Basic stats: COMPLETE Column stats: COMPLETE |
|               Filter Operator |
|                 predicate: (((hd_dep_count = 4) or (hd_vehicle_count = 2)) and hd_demo_sk is not null) (type: boolean) |
|                 Statistics: Num rows: 1854 Data size: 22248 Basic stats: COMPLETE Column stats: COMPLETE |
|                 Select Operator |
|                   expressions: hd_demo_sk (type: int) |
|                   outputColumnNames: _col0 |
|                   Statistics: Num rows: 1854 Data size: 7416 Basic stats: COMPLETE Column stats: COMPLETE |
|                   Spark HashTable Sink Operator |
|                     keys: |
|                       0 _col4 (type: int) |
|                       1 _col0 (type: int) |
|           Local Work: |
|             Map Reduce Local Work |
|         Map 3 |
|           Map Operator Tree: |
|             TableScan |
|               alias: store |
|               filterExpr: ((s_city = 'Fairview') and s_store_sk is not null) (type: boolean) |
|               Statistics: Num rows: 16 Data size: 30796 Basic stats: COMPLETE Column stats: COMPLETE |
|               Filter Operator |
|                 predicate: ((s_city = 'Fairview') and s_store_sk is not null) (type: boolean) |
|                 Statistics: Num rows: 8 Data size: 752 Basic stats: COMPLETE Column stats: COMPLETE |
|                 Select Operator |
|                   expressions: s_store_sk (type: int) |
|                   outputColumnNames: _col0 |
|                   Statistics: Num rows: 8 Data size: 32 Basic stats: COMPLETE Column stats: COMPLETE |
|                   Spark HashTable Sink Operator |
|                     keys: |
|                       0 _col6 (type: int) |
|                       1 _col0 (type: int) |
|           Local Work: |
|             Map Reduce Local Work |
| |
|   Stage: Stage-1 |
|     Spark |
|       Edges: |
|         Reducer 5 <- Map 4 (GROUP, 108) |
|       DagName: omm_20160723095042_5a24f0ed-b329-4853-a813-474a22f68a98:1 |
|       Vertices: |
|         Map 4 |
|           Map Operator Tree: |
|             TableScan |
|               alias: store_sales |
|               filterExpr: (ss_hdemo_sk is not null and ss_store_sk is not null) (type: boolean) |
|               Statistics: Num rows: 2816397978 Data size: 253800577128 Basic stats: COMPLETE Column stats: COMPLETE |
|               Filter Operator |
|                 predicate: (ss_hdemo_sk is not null and ss_store_sk is not null) (type: boolean) |
|                 Statistics: Num rows: 2557257550 Data size: 89651504960 Basic stats: COMPLETE Column stats: COMPLETE |
|                 Map Join Operator |
|                   condition map: |
|                     Inner Join 0 to 1 |
|                   keys: |
|                     0 ss_sold_date_sk (type: int) |
|                     1 _col0 (type: int) |
|                   outputColumnNames: _col2, _col4, _col5, _col6, _col8, _col18, _col21 |
|                   input vertices: |
|                     1 Map 1 |
|                   Statistics: Num rows: 639305624 Data size: 20457779968 Basic stats: COMPLETE Column stats: COMPLETE |
|                   Map Join Operator |
|                     condition map: |
|                       Inner Join 0 to 1 |
|                     keys: |
|                       0 _col4 (type: int) |
|                       1 _col0 (type: int) |
|                     outputColumnNames: _col2, _col5, _col6, _col8, _col18, _col21 |
|                     input vertices: |
|                       1 Map 2 |
|                     Statistics: Num rows: 164621194 Data size: 4609393432 Basic stats: COMPLETE Column stats: COMPLETE |
|                     Map Join Operator |
|                       condition map: |
|                         Inner Join 0 to 1 |
|                       keys: |
|                         0 _col6 (type: int) |
|                         1 _col0 (type: int) |
|                       outputColumnNames: _col2, _col5, _col8, _col18, _col21 |
|                       input vertices: |
|                         1 Map 3 |
|                       Statistics: Num rows: 82310597 Data size: 1975454328 Basic stats: COMPLETE Column stats: COMPLETE |
|                       Select Operator |
|                         expressions: _col8 (type: bigint), _col2 (type: int), _col5 (type: int), _col18 (type: float), _col21 (type: float) |
|                         outputColumnNames: _col0, _col1, _col2, _col3, _col4 |
|                         Statistics: Num rows: 82310597 Data size: 1975454328 Basic stats: COMPLETE Column stats: COMPLETE |
|                         Group By Operator |
|                           aggregations: sum(_col3), sum(_col4) |
|                           keys: _col0 (type: bigint), _col1 (type: int), _col2 (type: int) |
|                           mode: hash |
|                           outputColumnNames: _col0, _col1, _col2, _col3, _col4 |
|                           Statistics: Num rows: 82310597 Data size: 2633939104 Basic stats: COMPLETE Column stats: COMPLETE |
|                           Reduce Output Operator |
|                             key expressions: _col0 (type: bigint), _col1 (type: int), _col2 (type: int) |
|                             sort order: +++ |
|                             Map-reduce partition columns: _col0 (type: bigint), _col1 (type: int), _col2 (type: int) |
|                             Statistics: Num rows: 82310597 Data size: 2633939104 Basic stats: COMPLETE Column stats: COMPLETE |
|                             TopN Hash Memory Usage: 0.1 |
|                             value expressions: _col3 (type: double), _col4 (type: double) |
|           Local Work: |
|             Map Reduce Local Work |
|         Reducer 5 |
|           Reduce Operator Tree: |
|             Group By Operator |
|               aggregations: sum(VALUE._col0), sum(VALUE._col1) |
|               keys: KEY._col0 (type: bigint), KEY._col1 (type: int), KEY._col2 (type: int) |
|               mode: mergepartial |
|               outputColumnNames: _col0, _col1, _col2, _col3, _col4 |
|               Statistics: Num rows: 82310597 Data size: 2633939104 Basic stats: COMPLETE Column stats: COMPLETE |
|               Select Operator |
|                 expressions: _col0 (type: bigint), _col1 (type: int), _col3 (type: double), _col4 (type: double) |
|                 outputColumnNames: _col0, _col1, _col2, _col3 |
|                 Statistics: Num rows: 82310597 Data size: 2304696716 Basic stats: COMPLETE Column stats: COMPLETE |
|                 Limit |
|                   Number of rows: 100 |
|                   Statistics: Num rows: 100 Data size: 2800 Basic stats: COMPLETE Column stats: COMPLETE |
|                   File Output Operator |
|                     compressed: false |
|                     Statistics: Num rows: 100 Data size: 2800 Basic stats: COMPLETE Column stats: COMPLETE |
|                     table: |
|                       input format: org.apache.hadoop.mapred.TextInputFormat |
|                       output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat |
|                       serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe |
| |
|   Stage: Stage-0 |
|     Fetch Operator |
|       limit: 100 |
|       Processor Tree: |
|         ListSink |
| |
+--------------------------------------------------------------------------------------------------------------------------------------------------+--+