Details
-
Bug
-
Status: Resolved
-
Critical
-
Resolution: Later
-
None
-
None
Description
git.commit.id.abbrev=3d863b5
Data Set :
{ "id" : 1, "lst" : [1,2,3,4] }
The below query returns 4 rows instead of 1. The expected behavior in this case is not documented properly
select id from `data.json` where 2 in (select flatten(lst) from `data.json`) order by flatten(lst); +------------+ | id | +------------+ | 1 | | 1 | | 1 | | 1 | +------------+
The below projects a flatten.
0: jdbc:drill:schema=dfs_eea> select id, flatten(lst) from `temp.json` where 2 in (select flatten(lst) from `temp.json`) order by flatten(lst); +------------+------------+ | id | EXPR$1 | +------------+------------+ | 1 | 1 | | 1 | 2 | | 1 | 3 | | 1 | 4 | +------------+------------+
We can agree on one of the 3 possibilites when flatten is not projected:
1. Irrespective of whether flatten is in the select list or not, we would still return more records based on flatten in the order by
2. Flatten in the order by clause does not change the no of records we return
3. Using flatten in an order by (or probably group by) is not supported
Whatever we agree on, we should document it more clearly. Let me know your thoughts
Attachments
Issue Links
- relates to
-
DRILL-2183 Flatten behavior not consistent with rest of drill when we use it on a non-existent field
- Open
-
DRILL-2167 Order by on a repeated index from the output of a flatten on large no of records results in incorrect results
- Resolved
-
DRILL-2228 Projecting '*' returns all nulls when we have flatten in a filter and order by
- Resolved
-
DRILL-2264 Incorrect data when we use aggregate functions with flatten
- Resolved
-
DRILL-2181 Throw proper error message when flatten is used within an 'order by' or 'group by'
- Closed