STAGE DEPENDENCIES:
  Stage-4 is a root stage
  Stage-3 depends on stages: Stage-4
  Stage-2 depends on stages: Stage-3
  Stage-1 depends on stages: Stage-2
  Stage-0 depends on stages: Stage-1

STAGE PLANS:
  Stage: Stage-4
    Spark
      Edges:
        Reducer 2 <- Map 1 (GROUP PARTITION-LEVEL SORT, 1)
        Reducer 6 <- Map 5 (GROUP PARTITION-LEVEL SORT, 1)
      DagName: ec2-user_20141216112222_ede6d4d5-abed-4f59-a7de-ab18f8f69ce1:4
      Vertices:
        Map 1
            Map Operator Tree:
                TableScan
                  alias: store_sales
                  filterExpr: (ss_quantity BETWEEN 0 AND 5 and ((ss_list_price BETWEEN 11 AND 21 or ss_coupon_amt BETWEEN 460 AND 1460) or ss_wholesale_cost BETWEEN 14 AND 34)) (type: boolean)
                  Statistics: Num rows: 1100149984 Data size: 47174678850 Basic stats: COMPLETE Column stats: COMPLETE
                  Filter Operator
                    predicate: (ss_quantity BETWEEN 0 AND 5 and ((ss_list_price BETWEEN 11 AND 21 or ss_coupon_amt BETWEEN 460 AND 1460) or ss_wholesale_cost BETWEEN 14 AND 34)) (type: boolean)
                    Statistics: Num rows: 825112488 Data size: 10929217832 Basic stats: COMPLETE Column stats: COMPLETE
                    Select Operator
                      expressions: ss_list_price (type: float)
                      outputColumnNames: ss_list_price
                      Statistics: Num rows: 825112488 Data size: 10929217832 Basic stats: COMPLETE Column stats: COMPLETE
                      Group By Operator
                        aggregations: avg(ss_list_price), count(ss_list_price), count(DISTINCT ss_list_price)
                        keys: ss_list_price (type: float)
                        mode: hash
                        outputColumnNames: _col0, _col1, _col2, _col3
                        Statistics: Num rows: 2366705 Data size: 45704632 Basic stats: COMPLETE Column stats: COMPLETE
                        Reduce Output Operator
                          key expressions: _col0 (type: float)
                          sort order: +
                          Statistics: Num rows: 2366705 Data size: 45704632 Basic stats: COMPLETE Column stats: COMPLETE
                          value expressions: _col1 (type: struct), _col2 (type: bigint)
        Map 5
            Map Operator Tree:
                TableScan
                  alias: store_sales
                  filterExpr: (ss_quantity BETWEEN 11 AND 15 and ((ss_list_price BETWEEN 66 AND 76 or ss_coupon_amt BETWEEN 920 AND 1920) or ss_wholesale_cost BETWEEN 4 AND 24)) (type: boolean)
                  Statistics: Num rows: 1100149984 Data size: 47174678850 Basic stats: COMPLETE Column stats: COMPLETE
                  Filter Operator
                    predicate: (ss_quantity BETWEEN 11 AND 15 and ((ss_list_price BETWEEN 66 AND 76 or ss_coupon_amt BETWEEN 920 AND 1920) or ss_wholesale_cost BETWEEN 4 AND 24)) (type: boolean)
                    Statistics: Num rows: 825112488 Data size: 10929217832 Basic stats: COMPLETE Column stats: COMPLETE
                    Select Operator
                      expressions: ss_list_price (type: float)
                      outputColumnNames: ss_list_price
                      Statistics: Num rows: 825112488 Data size: 10929217832 Basic stats: COMPLETE Column stats: COMPLETE
                      Group By Operator
                        aggregations: avg(ss_list_price), count(ss_list_price), count(DISTINCT ss_list_price)
                        keys: ss_list_price (type: float)
                        mode: hash
                        outputColumnNames: _col0, _col1, _col2, _col3
                        Statistics: Num rows: 2366705 Data size: 45704632 Basic stats: COMPLETE Column stats: COMPLETE
                        Reduce Output Operator
                          key expressions: _col0 (type: float)
                          sort order: +
                          Statistics: Num rows: 2366705 Data size: 45704632 Basic stats: COMPLETE Column stats: COMPLETE
                          value expressions: _col1 (type: struct), _col2 (type: bigint)
        Reducer 2
            Local Work:
              Map Reduce Local Work
            Reduce Operator Tree:
              Group By Operator
                aggregations: avg(VALUE._col0), count(VALUE._col1), count(DISTINCT KEY._col0:0._col0)
                mode: mergepartial
                outputColumnNames: _col0, _col1, _col2
                Statistics: Num rows: 1 Data size: 24 Basic stats: COMPLETE Column stats: COMPLETE
                Spark HashTable Sink Operator
                  condition expressions:
                    0 {_col0} {_col1} {_col2}
                    1 {_col0} {_col1} {_col2}
                  keys:
                    0
                    1
        Reducer 6
            Local Work:
              Map Reduce Local Work
            Reduce Operator Tree:
              Group By Operator
                aggregations: avg(VALUE._col0), count(VALUE._col1), count(DISTINCT KEY._col0:0._col0)
                mode: mergepartial
                outputColumnNames: _col0, _col1, _col2
                Statistics: Num rows: 1 Data size: 24 Basic stats: COMPLETE Column stats: COMPLETE
                Spark HashTable Sink Operator
                  condition expressions:
                    0 {_col0} {_col1} {_col2} {_col3} {_col4} {_col5}
                    1 {_col0} {_col1} {_col2}
                  keys:
                    0
                    1

  Stage: Stage-3
    Spark
      Edges:
        Reducer 4 <- Map 3 (GROUP PARTITION-LEVEL SORT, 1)
      DagName: ec2-user_20141216112222_ede6d4d5-abed-4f59-a7de-ab18f8f69ce1:3
      Vertices:
        Map 3
            Map Operator Tree:
                TableScan
                  alias: store_sales
                  filterExpr: (ss_quantity BETWEEN 6 AND 10 and ((ss_list_price BETWEEN 91 AND 101 or ss_coupon_amt BETWEEN 1430 AND 2430) or ss_wholesale_cost BETWEEN 32 AND 52)) (type: boolean)
                  Statistics: Num rows: 1100149984 Data size: 47174678850 Basic stats: COMPLETE Column stats: NONE
                  Filter Operator
                    predicate: (ss_quantity BETWEEN 6 AND 10 and ((ss_list_price BETWEEN 91 AND 101 or ss_coupon_amt BETWEEN 1430 AND 2430) or ss_wholesale_cost BETWEEN 32 AND 52)) (type: boolean)
                    Statistics: Num rows: 825112488 Data size: 35381009137 Basic stats: COMPLETE Column stats: NONE
                    Select Operator
                      expressions: ss_list_price (type: float)
                      outputColumnNames: ss_list_price
                      Statistics: Num rows: 825112488 Data size: 35381009137 Basic stats: COMPLETE Column stats: NONE
                      Group By Operator
                        aggregations: avg(ss_list_price), count(ss_list_price), count(DISTINCT ss_list_price)
                        keys: ss_list_price (type: float)
                        mode: hash
                        outputColumnNames: _col0, _col1, _col2, _col3
                        Statistics: Num rows: 825112488 Data size: 35381009137 Basic stats: COMPLETE Column stats: NONE
                        Reduce Output Operator
                          key expressions: _col0 (type: float)
                          sort order: +
                          Statistics: Num rows: 825112488 Data size: 35381009137 Basic stats: COMPLETE Column stats: NONE
                          value expressions: _col1 (type: struct), _col2 (type: bigint)
        Reducer 4
            Local Work:
              Map Reduce Local Work
            Reduce Operator Tree:
              Group By Operator
                aggregations: avg(VALUE._col0), count(VALUE._col1), count(DISTINCT KEY._col0:0._col0)
                mode: mergepartial
                outputColumnNames: _col0, _col1, _col2
                Statistics: Num rows: 1 Data size: 32 Basic stats: COMPLETE Column stats: NONE
                Map Join Operator
                  condition map:
                       Inner Join 0 to 1
                  condition expressions:
                    0 {_col0} {_col1} {_col2}
                    1 {_col0} {_col1} {_col2}
                  keys:
                    0
                    1
                  outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5
                  input vertices:
                    0 Reducer 2
                  Statistics: Num rows: 1 Data size: 26 Basic stats: COMPLETE Column stats: NONE
                  Map Join Operator
                    condition map:
                         Inner Join 0 to 1
                    condition expressions:
                      0 {_col0} {_col1} {_col2} {_col3} {_col4} {_col5}
                      1 {_col0} {_col1} {_col2}
                    keys:
                      0
                      1
                    outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8
                    input vertices:
                      1 Reducer 6
                    Statistics: Num rows: 1 Data size: 28 Basic stats: COMPLETE Column stats: NONE
                    Spark HashTable Sink Operator
                      condition expressions:
                        0 {_col0} {_col1} {_col2} {_col3} {_col4} {_col5} {_col6} {_col7} {_col8}
                        1 {_col0} {_col1} {_col2}
                      keys:
                        0
                        1

  Stage: Stage-2
    Spark
      Edges:
        Reducer 12 <- Map 11 (GROUP PARTITION-LEVEL SORT, 1)
        Reducer 8 <- Map 7 (GROUP PARTITION-LEVEL SORT, 1)
      DagName: ec2-user_20141216112222_ede6d4d5-abed-4f59-a7de-ab18f8f69ce1:2
      Vertices:
        Map 11
            Map Operator Tree:
                TableScan
                  alias: store_sales
                  filterExpr: (ss_quantity BETWEEN 26 AND 30 and ((ss_list_price BETWEEN 28 AND 38 or ss_coupon_amt BETWEEN 2513 AND 3513) or ss_wholesale_cost BETWEEN 42 AND 62)) (type: boolean)
                  Statistics: Num rows: 1100149984 Data size: 47174678850 Basic stats: COMPLETE Column stats: COMPLETE
                  Filter Operator
                    predicate: (ss_quantity BETWEEN 26 AND 30 and ((ss_list_price BETWEEN 28 AND 38 or ss_coupon_amt BETWEEN 2513 AND 3513) or ss_wholesale_cost BETWEEN 42 AND 62)) (type: boolean)
                    Statistics: Num rows: 825112488 Data size: 10929217832 Basic stats: COMPLETE Column stats: COMPLETE
                    Select Operator
                      expressions: ss_list_price (type: float)
                      outputColumnNames: ss_list_price
                      Statistics: Num rows: 825112488 Data size: 10929217832 Basic stats: COMPLETE Column stats: COMPLETE
                      Group By Operator
                        aggregations: avg(ss_list_price), count(ss_list_price), count(DISTINCT ss_list_price)
                        keys: ss_list_price (type: float)
                        mode: hash
                        outputColumnNames: _col0, _col1, _col2, _col3
                        Statistics: Num rows: 2366705 Data size: 45704632 Basic stats: COMPLETE Column stats: COMPLETE
                        Reduce Output Operator
                          key expressions: _col0 (type: float)
                          sort order: +
                          Statistics: Num rows: 2366705 Data size: 45704632 Basic stats: COMPLETE Column stats: COMPLETE
                          value expressions: _col1 (type: struct), _col2 (type: bigint)
        Map 7
            Map Operator Tree:
                TableScan
                  alias: store_sales
                  filterExpr: (ss_quantity BETWEEN 16 AND 20 and ((ss_list_price BETWEEN 142 AND 152 or ss_coupon_amt BETWEEN 3054 AND 4054) or ss_wholesale_cost BETWEEN 80 AND 100)) (type: boolean)
                  Statistics: Num rows: 1100149984 Data size: 47174678850 Basic stats: COMPLETE Column stats: NONE
                  Filter Operator
                    predicate: (ss_quantity BETWEEN 16 AND 20 and ((ss_list_price BETWEEN 142 AND 152 or ss_coupon_amt BETWEEN 3054 AND 4054) or ss_wholesale_cost BETWEEN 80 AND 100)) (type: boolean)
                    Statistics: Num rows: 825112488 Data size: 35381009137 Basic stats: COMPLETE Column stats: NONE
                    Select Operator
                      expressions: ss_list_price (type: float)
                      outputColumnNames: ss_list_price
                      Statistics: Num rows: 825112488 Data size: 35381009137 Basic stats: COMPLETE Column stats: NONE
                      Group By Operator
                        aggregations: avg(ss_list_price), count(ss_list_price), count(DISTINCT ss_list_price)
                        keys: ss_list_price (type: float)
                        mode: hash
                        outputColumnNames: _col0, _col1, _col2, _col3
                        Statistics: Num rows: 825112488 Data size: 35381009137 Basic stats: COMPLETE Column stats: NONE
                        Reduce Output Operator
                          key expressions: _col0 (type: float)
                          sort order: +
                          Statistics: Num rows: 825112488 Data size: 35381009137 Basic stats: COMPLETE Column stats: NONE
                          value expressions: _col1 (type: struct), _col2 (type: bigint)
        Reducer 12
            Local Work:
              Map Reduce Local Work
            Reduce Operator Tree:
              Group By Operator
                aggregations: avg(VALUE._col0), count(VALUE._col1), count(DISTINCT KEY._col0:0._col0)
                mode: mergepartial
                outputColumnNames: _col0, _col1, _col2
                Statistics: Num rows: 1 Data size: 24 Basic stats: COMPLETE Column stats: COMPLETE
                Spark HashTable Sink Operator
                  condition expressions:
                    0 {_col0} {_col1} {_col2} {_col3} {_col4} {_col5} {_col6} {_col7} {_col8} {_col9} {_col10} {_col11} {_col12} {_col13} {_col14}
                    1 {_col0} {_col1} {_col2}
                  keys:
                    0
                    1
        Reducer 8
            Local Work:
              Map Reduce Local Work
            Reduce Operator Tree:
              Group By Operator
                aggregations: avg(VALUE._col0), count(VALUE._col1), count(DISTINCT KEY._col0:0._col0)
                mode: mergepartial
                outputColumnNames: _col0, _col1, _col2
                Statistics: Num rows: 1 Data size: 32 Basic stats: COMPLETE Column stats: NONE
                Map Join Operator
                  condition map:
                       Inner Join 0 to 1
                  condition expressions:
                    0 {_col0} {_col1} {_col2} {_col3} {_col4} {_col5} {_col6} {_col7} {_col8}
                    1 {_col0} {_col1} {_col2}
                  keys:
                    0
                    1
                  outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11
                  input vertices:
                    0 Reducer 4
                  Statistics: Num rows: 1 Data size: 30 Basic stats: COMPLETE Column stats: NONE
                  Spark HashTable Sink Operator
                    condition expressions:
                      0 {_col0} {_col1} {_col2} {_col3} {_col4} {_col5} {_col6} {_col7} {_col8} {_col9} {_col10} {_col11}
                      1 {_col0} {_col1} {_col2}
                    keys:
                      0
                      1

  Stage: Stage-1
    Spark
      Edges:
        Reducer 10 <- Map 9 (GROUP PARTITION-LEVEL SORT, 1)
      DagName: ec2-user_20141216112222_ede6d4d5-abed-4f59-a7de-ab18f8f69ce1:1
      Vertices:
        Map 9
            Map Operator Tree:
                TableScan
                  alias: store_sales
                  filterExpr: (ss_quantity BETWEEN 21 AND 25 and ((ss_list_price BETWEEN 135 AND 145 or ss_coupon_amt BETWEEN 14180 AND 15180) or ss_wholesale_cost BETWEEN 38 AND 58)) (type: boolean)
                  Statistics: Num rows: 1100149984 Data size: 47174678850 Basic stats: COMPLETE Column stats: NONE
                  Filter Operator
                    predicate: (ss_quantity BETWEEN 21 AND 25 and ((ss_list_price BETWEEN 135 AND 145 or ss_coupon_amt BETWEEN 14180 AND 15180) or ss_wholesale_cost BETWEEN 38 AND 58)) (type: boolean)
                    Statistics: Num rows: 825112488 Data size: 35381009137 Basic stats: COMPLETE Column stats: NONE
                    Select Operator
                      expressions: ss_list_price (type: float)
                      outputColumnNames: ss_list_price
                      Statistics: Num rows: 825112488 Data size: 35381009137 Basic stats: COMPLETE Column stats: NONE
                      Group By Operator
                        aggregations: avg(ss_list_price), count(ss_list_price), count(DISTINCT ss_list_price)
                        keys: ss_list_price (type: float)
                        mode: hash
                        outputColumnNames: _col0, _col1, _col2, _col3
                        Statistics: Num rows: 825112488 Data size: 35381009137 Basic stats: COMPLETE Column stats: NONE
                        Reduce Output Operator
                          key expressions: _col0 (type: float)
                          sort order: +
                          Statistics: Num rows: 825112488 Data size: 35381009137 Basic stats: COMPLETE Column stats: NONE
                          value expressions: _col1 (type: struct), _col2 (type: bigint)
        Reducer 10
            Local Work:
              Map Reduce Local Work
            Reduce Operator Tree:
              Group By Operator
                aggregations: avg(VALUE._col0), count(VALUE._col1), count(DISTINCT KEY._col0:0._col0)
                mode: mergepartial
                outputColumnNames: _col0, _col1, _col2
                Statistics: Num rows: 1 Data size: 32 Basic stats: COMPLETE Column stats: NONE
                Map Join Operator
                  condition map:
                       Inner Join 0 to 1
                  condition expressions:
                    0 {_col0} {_col1} {_col2} {_col3} {_col4} {_col5} {_col6} {_col7} {_col8} {_col9} {_col10} {_col11}
                    1 {_col0} {_col1} {_col2}
                  keys:
                    0
                    1
                  outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14
                  input vertices:
                    0 Reducer 8
                  Statistics: Num rows: 1 Data size: 33 Basic stats: COMPLETE Column stats: NONE
                  Map Join Operator
                    condition map:
                         Inner Join 0 to 1
                    condition expressions:
                      0 {_col0} {_col1} {_col2} {_col3} {_col4} {_col5} {_col6} {_col7} {_col8} {_col9} {_col10} {_col11} {_col12} {_col13} {_col14}
                      1 {_col0} {_col1} {_col2}
                    keys:
                      0
                      1
                    outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14, _col15, _col16, _col17
                    input vertices:
                      1 Reducer 12
                    Statistics: Num rows: 1 Data size: 36 Basic stats: COMPLETE Column stats: NONE
                    Select Operator
                      expressions: _col0 (type: double), _col1 (type: bigint), _col2 (type: bigint), _col3 (type: double), _col4 (type: bigint), _col5 (type: bigint), _col6 (type: double), _col7 (type: bigint), _col8 (type: bigint), _col9 (type: double), _col10 (type: bigint), _col11 (type: bigint), _col12 (type: double), _col13 (type: bigint), _col14 (type: bigint), _col15 (type: double), _col16 (type: bigint), _col17 (type: bigint)
                      outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14, _col15, _col16, _col17
                      Statistics: Num rows: 1 Data size: 36 Basic stats: COMPLETE Column stats: NONE
                      Limit
                        Number of rows: 100
                        Statistics: Num rows: 1 Data size: 36 Basic stats: COMPLETE Column stats: NONE
                        File Output Operator
                          compressed: false
                          Statistics: Num rows: 1 Data size: 36 Basic stats: COMPLETE Column stats: NONE
                          table:
                              input format: org.apache.hadoop.mapred.TextInputFormat
                              output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                              serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe

  Stage: Stage-0
    Fetch Operator
      limit: 100
      Processor Tree:
        ListSink