Details
-
Bug
-
Status: Resolved
-
Major
-
Resolution: Won't Fix
-
1.0.0
Description
Two parquet tables:
0: jdbc:drill:schema=dfs> select * from t1; +------------+------------+------------+ | a1 | b1 | c1 | +------------+------------+------------+ | 1 | aaaaa | 2015-01-01 | | 2 | bbbbb | 2015-01-02 | | 3 | ccccc | 2015-01-03 | | 4 | null | 2015-01-04 | | 5 | eeeee | 2015-01-05 | | 6 | fffff | 2015-01-06 | | 7 | ggggg | 2015-01-07 | | null | hhhhh | 2015-01-08 | | 9 | iiiii | null | | 10 | jjjjj | 2015-01-10 | +------------+------------+------------+ 10 rows selected (0.109 seconds) 0: jdbc:drill:schema=dfs> select * from t3; +------------+------------+------------+ | a3 | b3 | c3 | +------------+------------+------------+ | 1 | aaaaa | 2015-01-01 | | 2 | bbbbb | 2015-01-02 | | 3 | ccccc | 2015-01-03 | +------------+------------+------------+ 3 rows selected (0.106 seconds)
When column name is not qualified with the table name in the join condition, we get wrong result:
0: jdbc:drill:schema=dfs> select * from t1 where exists (select * from t3 where a1 = a3); +------------+------------+------------+ | a1 | b1 | c1 | +------------+------------+------------+ | 1 | aaaaa | 2015-01-01 | | 2 | bbbbb | 2015-01-02 | | 3 | ccccc | 2015-01-03 | | 4 | null | 2015-01-04 | | 5 | eeeee | 2015-01-05 | | 6 | fffff | 2015-01-06 | | 7 | ggggg | 2015-01-07 | | null | hhhhh | 2015-01-08 | | 9 | iiiii | null | | 10 | jjjjj | 2015-01-10 | +------------+------------+------------+ 10 rows selected (0.177 seconds)
Works correctly when column names are fully qualified ...
0: jdbc:drill:schema=dfs> select * from t1 where exists (select * from t3 where t1.a1 = t3.a3); +------------+------------+------------+ | a1 | b1 | c1 | +------------+------------+------------+ | 1 | aaaaa | 2015-01-01 | | 2 | bbbbb | 2015-01-02 | | 3 | ccccc | 2015-01-03 | +------------+------------+------------+ 3 rows selected (0.353 seconds)
Plan for the query with the wrong result:
0: jdbc:drill:schema=dfs> explain plan for select * from t1 where exists (select * from t3 where a1 = a3); +------------+------------+ | text | json | +------------+------------+ | 00-00 Screen 00-01 Project(*=[$0]) 00-02 Project(T124¦¦*=[$0]) 00-03 NestedLoopJoin(condition=[true], joinType=[inner]) 00-05 Project(T124¦¦*=[$0]) 00-07 Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:/drill/testdata/aggregation/t1]], selectionRoot=/drill/testdata/aggregation/t1, numFiles=1, columns=[`*`]]]) 00-04 SelectionVectorRemover 00-06 Filter(condition=[IS NOT NULL($0)]) 00-08 StreamAgg(group=[{}], agg#0=[MIN($0)]) 00-09 Project($f0=[true]) 00-10 SelectionVectorRemover 00-11 Filter(condition=[=($0, $1)]) 00-12 Project(a1=[$1], a3=[$0]) 00-13 Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:/drill/testdata/aggregation/t3]], selectionRoot=/drill/testdata/aggregation/t3, numFiles=1, columns=[`a1`, `a3`]]])
Attachments
Issue Links
- is related to
-
DRILL-1868 Provide options to handle nonexistent columns in schema-less query
- Open