[DRILL-2232] Flatten functionality not well defined when we use flatten in an order by without projecting it - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Critical
Resolution: Later
Affects Version/s: None
Fix Version/s: 1.0.0
Component/s: Execution - Relational Operators
Labels: None

Description

git.commit.id.abbrev=3d863b5

Data set:

{
  "id" : 1,
  "lst" : [1,2,3,4]
}

The query below returns 4 rows instead of 1, even though flatten appears only in the order by clause and is not projected. The expected behavior in this case is not documented.

select id from `data.json` where 2 in (select flatten(lst) from `data.json`) order by flatten(lst);
+------------+
|     id     |
+------------+
| 1          |
| 1          |
| 1          |
| 1          |
+------------+

For comparison, the query below also projects the flatten and returns 4 rows:

0: jdbc:drill:schema=dfs_eea> select id, flatten(lst) from `temp.json` where 2 in (select flatten(lst) from `temp.json`) order by flatten(lst);
+------------+------------+
|     id     |   EXPR$1   |
+------------+------------+
| 1          | 1          |
| 1          | 2          |
| 1          | 3          |
| 1          | 4          |
+------------+------------+

We can agree on one of the 3 possibilities for the case where flatten is not projected:

1. Irrespective of whether flatten is in the select list or not, we still return the additional records produced by the flatten in the order by
2. Flatten in the order by clause does not change the number of records we return
3. Using flatten in an order by (or probably group by) is not supported
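
The three options above can be sketched in a few lines of Python. This is only an illustrative model of the semantics under discussion, not Drill's implementation. The helper names (`flatten_rows`, `option_1`, `option_2`, `option_3`) are hypothetical, and the where clause is assumed to pass because 2 is an element of `lst`.

```python
# Hypothetical model of the three candidate behaviors; not Drill code.
record = {"id": 1, "lst": [1, 2, 3, 4]}


def flatten_rows(rec, field):
    """Emit one copy of the record per element of the repeated field."""
    return [{**rec, field: value} for value in rec[field]]


def option_1(rec):
    # Flatten in the order by still expands rows, even though only id is projected.
    expanded = sorted(flatten_rows(rec, "lst"), key=lambda r: r["lst"])
    return [r["id"] for r in expanded]


def option_2(rec):
    # Flatten in the order by does not change the number of rows returned.
    return [rec["id"]]


def option_3(rec):
    # Flatten in the order by is rejected outright.
    raise ValueError("flatten is not supported in order by")


print(option_1(record))  # [1, 1, 1, 1]
print(option_2(record))  # [1]
```

Option 1 matches the 4-row output reported above, while option 2 matches the single row the reporter expected.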

Whatever we agree on, we should document it more clearly. Let me know your thoughts.

Issue Links

relates to

DRILL-2183 Flatten behavior not consistent with rest of drill when we use it on a non-existent field (Open)
DRILL-2167 Order by on a repeated index from the output of a flatten on large no of records results in incorrect results (Resolved)
DRILL-2228 Projecting '*' returns all nulls when we have flatten in a filter and order by (Resolved)
DRILL-2264 Incorrect data when we use aggregate functions with flatten (Resolved)
DRILL-2181 Throw proper error message when flatten is used within an 'order by' or 'group by' (Closed)

People

Assignee: Jason Altekruse

Reporter: Rahul Kumar Challapalli

Votes: 0

Watchers: 2

Dates

Created: 12/Feb/15 22:19

Updated: 05/May/15 00:37

Resolved: 05/May/15 00:36