Affects Version/s: None
Fix Version/s: 0.5.0
While performing the following query on two text files, the projected column does not get pushed into the scan.
explain plan for select t1.columns[1] from dfs.`/tmp/t1.csv` t1, dfs.`/tmp/t2.csv` t2 where t1.columns[0] = t2.columns[0];
00-01      Project(EXPR$0=[ITEM($0, 1)])
00-02        HashJoin(condition=[=($1, $2)], joinType=[inner])
00-04          Project(columns=[$0], $f2=[ITEM($0, 0)])
00-06            Scan(groupscan=[EasyGroupScan [selectionRoot=/tmp/t1.csv, columns = [SchemaPath [`columns`], SchemaPath [`columns`]]]])
00-05          Project($f2=[ITEM($0, 0)])
00-07            Scan(groupscan=[EasyGroupScan [selectionRoot=/tmp/t2.csv, columns = [SchemaPath [`columns`]]]])
In the above plan, the scan of /tmp/t1.csv (00-06) has two column projections pushed into it, and both are printed as the whole 'columns' array.
We should not push the entire 'columns' array into the scan; instead we should push 'columns[1]', which is the projected column (ITEM($0, 1) in 00-01).
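
The expected behaviour can be illustrated with a small sketch. This is a toy model and not Drill's planner code: the function name columns_to_push and the (name, index) representation of a column reference are hypothetical, introduced only for illustration. The indices are taken from the ITEM($0, 1) and ITEM($0, 0) expressions in the plan; including the join key columns[0] alongside the projected columns[1] is an assumption about what a narrowed scan would still need to read.

```python
from typing import Dict, Iterable, List, Optional, Set, Tuple

# Hypothetical model, not Drill's API: a column reference is (name, index),
# where index is None when the whole array is referenced.
ColumnRef = Tuple[str, Optional[int]]


def columns_to_push(refs: Iterable[ColumnRef]) -> List[str]:
    """Return the narrowest list of column paths a scan needs to read."""
    whole: Set[str] = set()             # names referenced without an index
    indexed: Dict[str, Set[int]] = {}   # name -> referenced element indices
    for name, index in refs:
        if index is None:
            whole.add(name)
        else:
            indexed.setdefault(name, set()).add(index)

    pushed: List[str] = []
    for name in sorted(whole | set(indexed)):
        if name in whole:
            # The full array is needed, so element paths add nothing.
            pushed.append(name)
        else:
            pushed.extend(f"{name}[{i}]" for i in sorted(indexed[name]))
    return pushed


# t1 side of the join: columns[1] is projected, columns[0] is the join key.
t1_refs = [("columns", 1), ("columns", 0)]
# t2 side: only the join key columns[0] is referenced.
t2_refs = [("columns", 0)]

print(columns_to_push(t1_refs))
print(columns_to_push(t2_refs))
```

Under this model the scan of /tmp/t1.csv would list only the referenced elements of 'columns', and it would fall back to the whole array only when some expression references 'columns' without an index.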