Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Impala 1.1
Description
It appears we don't always set numNodes properly for all PlanNodes, which may cause insert plans with few partitions to be incorrectly hash repartitioned.
Two places that Anty Rao (the bug reporter) mentioned explicitly were:
1. Exchange Nodes
2. Aggregation Nodes in the code path without distinct aggregation
We should also add more Preconditions checks to make sure stats are always set in the proper places.
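As a rough illustration of the kind of check meant here, the sketch below uses simplified, hypothetical stand-ins for the planner classes (`PlanNode`, `ScanNode`, `ExchangeNode`, `AggregationNode`, and the `finalizeStats` method are invented for this example and are not Impala's actual code). It assumes that exchange and aggregation nodes take numNodes from their child, and it uses a plain `IllegalStateException` where the real code would presumably call Guava's `Preconditions.checkState`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-ins for planner classes; not Impala's actual code.
public class NumNodesSketch {
    static final int UNSET = -1;

    abstract static class PlanNode {
        final List<PlanNode> children = new ArrayList<>();
        int numNodes = UNSET;

        // Every subclass is expected to assign numNodes here.
        abstract void computeStats();

        // Computes stats bottom-up and fails fast if a node left numNodes unset.
        final void finalizeStats() {
            for (PlanNode child : children) {
                child.finalizeStats();
            }
            computeStats();
            if (numNodes <= 0) {
                throw new IllegalStateException(
                    getClass().getSimpleName() + " did not set numNodes");
            }
        }
    }

    static class ScanNode extends PlanNode {
        private final int hosts;
        ScanNode(int hosts) { this.hosts = hosts; }
        @Override void computeStats() { numNodes = hosts; }
    }

    // Assumption for this sketch: aggregation runs on the same nodes as its input.
    static class AggregationNode extends PlanNode {
        AggregationNode(PlanNode child) { children.add(child); }
        @Override void computeStats() { numNodes = children.get(0).numNodes; }
    }

    // Assumption for this sketch: an exchange inherits numNodes from its child.
    static class ExchangeNode extends PlanNode {
        ExchangeNode(PlanNode child) { children.add(child); }
        @Override void computeStats() { numNodes = children.get(0).numNodes; }
    }

    // Models the bug: a node type that never assigns numNodes.
    static class ForgetfulExchangeNode extends PlanNode {
        ForgetfulExchangeNode(PlanNode child) { children.add(child); }
        @Override void computeStats() { /* numNodes intentionally left unset */ }
    }

    // Builds scan -> aggregation -> exchange and returns the root's numNodes.
    public static int plannedNumNodes(int scanHosts) {
        PlanNode root = new ExchangeNode(new AggregationNode(new ScanNode(scanHosts)));
        root.finalizeStats();
        return root.numNodes;
    }

    // Returns true if the check catches a node that forgot to set numNodes.
    public static boolean rejectsUnsetNumNodes() {
        PlanNode root = new AggregationNode(new ForgetfulExchangeNode(new ScanNode(3)));
        try {
            root.finalizeStats();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }
}
```

With a check like this in one shared place, a node type that skips the assignment fails at planning time instead of silently feeding an unset value into the repartitioning decision.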
Here's the query Anty mentioned:
INSERT OVERWRITE TABLE tableName PARTITION (dt='20130729')
SELECT c1,
       CASE WHEN substr(c2,1,2)<>'86' THEN 3 WHEN city_id IS NOT NULL THEN 1 ELSE 2 END AS c3,
       c4,
       c2,
       sum(c5 + c6),
       sum(CASE WHEN statistic_code LIKE '2%' THEN 1 ELSE 0 END) AS success_count
FROM dw_wap a
WHERE url IS NOT NULL
GROUP BY c1, c3, c4, c2;