AFFECTED_FUNCTIONALITY: INNER JOIN
ISSUE_DESCRIPTION: New JSON data types were added in
DRILL-5919: NaN, Infinity, -Infinity.
During testing, the INNER JOIN operator was found to behave inconsistently: two almost identical queries return different results.
Query1: select distinct t.name, tt.name from dfs.tmp.`ObjsX.json` t inner join dfs.tmp.`ObjsX.json` tt on t.attr4 = tt.attr4
Query2: select distinct t.name from dfs.tmp.`ObjsX.json` t inner join dfs.tmp.`ObjsX.json` tt on t.attr4 = tt.attr4
Query1 differs from Query2 by one column only:
- In Query1 - 2 columns are selected - t.name, tt.name
- In Query2 - 1 column is selected - t.name
However, Query1 and Query2 return completely different results:
- Query1 returns
name     name0
object2  object2
object2  object3
object2  object4
object3  object2
object3  object3
object3  object4
object4  object2
object4  object3
object4  object4
This result seems to be correct.
- Query2 returns "No result found", which is not expected.
Expected result:
name
object2
object3
object4
Actual result:
No result found
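The expected result for Query2 follows from Query1: both queries use the same join, so the distinct t.name values of Query2 should be the distinct first-column values of Query1. Below is a minimal Python sketch of that check, using only the Query1 rows reported in this issue. The helper name project_distinct_first is hypothetical and is not part of Drill.

```python
# Rows reported for Query1 in this issue, as (name, name0) pairs.
query1_rows = [
    ("object2", "object2"),
    ("object2", "object3"),
    ("object2", "object4"),
    ("object3", "object2"),
    ("object3", "object3"),
    ("object3", "object4"),
    ("object4", "object2"),
    ("object4", "object3"),
    ("object4", "object4"),
]


def project_distinct_first(rows):
    """Return the distinct first-column values, in first-seen order.

    Hypothetical helper: it mimics 'select distinct t.name' applied
    to the joined rows that Query1 already returned.
    """
    seen = []
    for name, _ in rows:
        if name not in seen:
            seen.append(name)
    return seen


# What Query2 would be expected to return, given Query1's output.
expected_query2 = project_distinct_first(query1_rows)
print(expected_query2)  # ['object2', 'object3', 'object4']
```

Since the join itself produces nine rows, an empty result for Query2 cannot be explained by the join condition alone.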
NB: the issue appears only if the tables are joined on a column that contains the newly added data types (NaN, Infinity, -Infinity). The issue is not reproducible if a user joins the tables on a column containing other data types.
Is caused by: DRILL-5919 Add non-numeric support for JSON processing