Details
-
Improvement
-
Status: Open
-
Major
-
Resolution: Unresolved
-
0.7.0
-
None
Description
The flatten function in Drill works great for any level of nesting, however the applying nested flatten can complicate query syntax as the # of nested levels increase due to the need to use subqueries, especially when the structure contains maps of repeated lists.
Consider JSON below { "Id":0, "level1":[ { "Id1":1, "level2":[ { "Id2":2, "level3":[ { "Id3":3, "Num":1 } ] } ] } ] }
Below is a query we need to write to flatten the 3 levels.
select count(t2.td.id3) from
(select flatten(t1.tp.level3) as td from
(select flatten(t.ts.level2) as tp from
(select flatten(level1) as ts from dfs.`default`.`/Users/nrentachintala/Downloads/flattenex.json`) as t) as t1) as t2;
An abbreviated/intuitive syntax to apply flatten on maps with arrays would make working with this kind of data much easier.
For ex:
The above query could potentially be rewritten as below.
select flatten(flatten(flatten(level1).level2).level3) as ts from dfs.`default`.`/Users/nrentachintala/Downloads/flattenex.json`
Attachments
Issue Links
- is related to
-
DRILL-2208 Error message must be updated when query contains operations on a flattened column
- Closed