Details
-
Bug
-
Status: Open
-
Major
-
Resolution: Unresolved
-
0.7.0
-
None
Description
The following query:
select t1.id as m_id, count(t2.id) as totals from table1 t1 left outer join table2 t2 on (t1.id = t2.id) where (datepartition == '2011-11-08-00') group by t1.id;
should prune to just a single partition (datepartition == '2011-11-08-00') . However, the filter is being applied in the reducer, so a full table scan is being done as part of the map.
One can get the correct behavior by pushing the filter into a select statement within the join itself:
select t1.id as m_id, count(p.id) as totals from table1 t1 left outer join (select * from table2 where datepartition == '2011-11-08-00') t2 on (t1.id = t2.id) group by t1.id;