Details
Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Description
When the following query joins two text files, the projected column is not pushed into the scan:
explain plan for select t1.columns[1] from dfs.`/tmp/t1.csv` t1, dfs.`/tmp/t2.csv` t2 where t1.columns[0] = t2.columns[0];
00-00 Screen
00-01   Project(EXPR$0=[ITEM($0, 1)])
00-02     HashJoin(condition=[=($1, $2)], joinType=[inner])
00-04       Project(columns=[$0], $f2=[ITEM($0, 0)])
00-06         Scan(groupscan=[EasyGroupScan [selectionRoot=/tmp/t1.csv, columns = [SchemaPath [`columns`], SchemaPath [`columns`[0]]]]])
00-03       Project($f20=[$0])
00-05         Project($f2=[ITEM($0, 0)])
00-07           Scan(groupscan=[EasyGroupScan [selectionRoot=/tmp/t2.csv, columns = [SchemaPath [`columns`[0]]]]])
In the plan above, the scan at 00-06 has the following column projections pushed into it: 'columns' and 'columns[0]'.
We should not push 'columns' into the scan; instead, we should push 'columns[1]', which is the projected column.
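
The expected behavior can be illustrated with a small sketch. This is not Drill's planner code. The function name `columns_to_push` and the (name, index) representation of column references are hypothetical and chosen only for illustration. It assumes a whole column needs to be pushed only when some reference uses the column without an index; otherwise only the indexed paths are pushed.

```python
def columns_to_push(references):
    """Return the column paths a scan would need, in first-seen order.

    `references` is a list of (column_name, index) pairs, one per use of a
    column in the query. An index of None means the whole column is used.
    (Hypothetical representation, not Drill's SchemaPath.)
    """
    # Columns referenced at least once without an index must be read whole.
    whole = {name for name, index in references if index is None}

    pushed = []
    for name, index in references:
        path = name if name in whole else f"{name}[{index}]"
        if path not in pushed:
            pushed.append(path)
    return pushed


# References to t1 in the query above: columns[1] (projection), columns[0] (join key).
print(columns_to_push([("columns", 1), ("columns", 0)]))
# ['columns[1]', 'columns[0]']
```

Under this assumption, the scan of t1.csv would receive 'columns[1]' and 'columns[0]' rather than the whole 'columns' array.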