Uploaded image for project: 'Apache Drill'
  1. Apache Drill
  2. DRILL-2783

Abbreviated Flatten syntax for multi-level nested elements

    XMLWordPrintableJSON

Details

    • Improvement
    • Status: Open
    • Major
    • Resolution: Unresolved
    • 0.7.0
    • Future
    • Storage - JSON
    • None

    Description

      The flatten function in Drill works great for any level of nesting, however the applying nested flatten can complicate query syntax as the # of nested levels increase due to the need to use subqueries, especially when the structure contains maps of repeated lists.

      Consider JSON below
       {
         "Id":0,
         "level1":[
            {
               "Id1":1,
               "level2":[
                  {
                     "Id2":2,
                     "level3":[
                        {
                           "Id3":3,
                           "Num":1
                        }
                     ]
                  }
               ]
            }
         ]
      }
      

      Below is a query we need to write to flatten the 3 levels.

      select count(t2.td.id3) from
      (select flatten(t1.tp.level3) as td from
      (select flatten(t.ts.level2) as tp from
      (select flatten(level1) as ts from dfs.`default`.`/Users/nrentachintala/Downloads/flattenex.json`) as t) as t1) as t2;

      An abbreviated/intuitive syntax to apply flatten on maps with arrays would make working with this kind of data much easier.
      For ex:
      The above query could potentially be rewritten as below.
      select flatten(flatten(flatten(level1).level2).level3) as ts from dfs.`default`.`/Users/nrentachintala/Downloads/flattenex.json`

      Attachments

        Issue Links

          Activity

            People

              Unassigned Unassigned
              Neeraja Neeraja
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated: