Details
Type: Bug
Status: Closed
Priority: Critical
Resolution: Not A Problem
1.2.0
None
Environment: 4 node cluster on CentOS
Description
A count over the results of a UNION ALL query returns incorrect results. The query below previously threw an exception (please see DRILL-2637). That JIRA was marked as fixed, but the query now returns incorrect results.
0: jdbc:drill:schema=dfs.tmp> select count(c1) from (select cast(columns[0] as int) c1 from `testWindow.csv`) union all (select cast(columns[0] as int) c2 from `testWindow.csv`);
+---------+
| EXPR$0  |
+---------+
| 11      |
| 100     |
| 10      |
| 2       |
| 50      |
| 55      |
| 67      |
| 113     |
| 119     |
| 89      |
| 57      |
| 61      |
+---------+
12 rows selected (0.753 seconds)
The results returned by the queries on the LHS and RHS of the UNION ALL operator are:
0: jdbc:drill:schema=dfs.tmp> select cast(columns[0] as int) c1 from `testWindow.csv`;
+------+
| c1   |
+------+
| 100  |
| 10   |
| 2    |
| 50   |
| 55   |
| 67   |
| 113  |
| 119  |
| 89   |
| 57   |
| 61   |
+------+
11 rows selected (0.197 seconds)
0: jdbc:drill:schema=dfs.tmp> select cast(columns[0] as int) c2 from `testWindow.csv`;
+------+
| c2   |
+------+
| 100  |
| 10   |
| 2    |
| 50   |
| 55   |
| 67   |
| 113  |
| 119  |
| 89   |
| 57   |
| 61   |
+------+
11 rows selected (0.173 seconds)
Note that enclosing the whole union in an outer pair of parentheses returns the correct result. We do not want to return incorrect results to the user when those parentheses are missing.
0: jdbc:drill:schema=dfs.tmp> select count(c1) from ((select cast(columns[0] as int) c1 from `testWindow.csv`) union all (select cast(columns[0] as int) c2 from `testWindow.csv`));
+---------+
| EXPR$0  |
+---------+
| 22      |
+---------+
1 row selected (0.234 seconds)
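
One possible reading of the two outputs above, consistent with the "Not A Problem" resolution, is that without the outer parentheses the count applies only to the first SELECT, and its single row (11) is then concatenated with the 11 rows of the second SELECT. The sketch below reproduces that grouping behaviour. It is an illustration under assumptions, not Drill's planner: it uses Python's built-in sqlite3 module as a stand-in engine, a hypothetical table `t` in place of `testWindow.csv`, and it drops the parentheses around the individual SELECT operands because SQLite does not accept them.

```python
import sqlite3

# The eleven values shown in the report's output for testWindow.csv.
VALUES = [100, 10, 2, 50, 55, 67, 113, 119, 89, 57, 61]


def run_queries():
    """Run the ungrouped and grouped forms of the query on an in-memory table."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table standing in for the CSV file (assumption).
    conn.execute("CREATE TABLE t (c1 INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?)", [(v,) for v in VALUES])

    # No outer parentheses: count(c1) belongs to the first SELECT only,
    # and UNION ALL appends the rows of the second SELECT to that one row.
    ungrouped = conn.execute(
        "SELECT count(c1) FROM (SELECT c1 FROM t) "
        "UNION ALL SELECT c1 FROM t"
    ).fetchall()

    # Whole union wrapped as the FROM subquery: count(c1) sees all 22 rows.
    grouped = conn.execute(
        "SELECT count(c1) FROM "
        "(SELECT c1 FROM t UNION ALL SELECT c1 FROM t)"
    ).fetchall()

    conn.close()
    return ungrouped, grouped


ungrouped, grouped = run_queries()
print(len(ungrouped))  # 12 rows: the count row plus 11 data rows
print(grouped)         # [(22,)]
```

Under these assumptions the stand-in engine shows the same shape as the report: 12 rows for the ungrouped query and a single count of 22 for the grouped one.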