Uploaded image for project: 'Apache Drill'
  1. Apache Drill
  2. DRILL-2264

Incorrect data when we use aggregate functions with flatten

    XMLWordPrintableJSON

Details

    Description

      git.commit.id.abbrev=6676f2d

      Data Set :

      {
        "uid":1,
        "lst_lst" : [[1,2],[3,4]]
      }
      {
        "uid":2,
        "lst_lst" : [[1,2],[3,4]]
      }
      

      The below query returns incorrect results :

      select uid,MAX( flatten(lst_lst[1]) + flatten(lst_lst[0])) from `temp.json` group by uid, flatten(lst_lst[1]), flatten(lst_lst[0]);
      +------------+------------+
      |    uid     |   EXPR$1   |
      +------------+------------+
      | 1          | 6          |
      | 1          | 6          |
      | 1          | 6          |
      | 1          | 6          |
      | 2          | 6          |
      | 2          | 6          |
      | 2          | 6          |
      | 2          | 6          |
      +------------+------------+
      

      However if we use a sub query, drill returns the right data

      select uid, MAX(l1+l2) from (select uid,flatten(lst_lst[1]) l1, flatten(lst_lst[0]) l2 from `temp.json`) sub group by uid, l1, l2;
      +------------+------------+
      |    uid     |   EXPR$1   |
      +------------+------------+
      | 1          | 4          |
      | 1          | 5          |
      | 1          | 5          |
      | 1          | 6          |
      | 2          | 4          |
      | 2          | 5          |
      | 2          | 5          |
      | 2          | 6          |
      +------------+------------+
      

      Also using a single flatten yields proper results

      select uid,MAX(flatten(lst_lst[0])) from `temp.json` group by uid, flatten(lst_lst[0]);
      +------------+------------+
      |    uid     |   EXPR$1   |
      +------------+------------+
      | 1          | 1          |
      | 1          | 2          |
      | 2          | 1          |
      | 2          | 2          |
      +------------+------------+
      

      Marked it as critical since we return in-correct data. Let me know if you have any other questions

      Attachments

        Issue Links

          Activity

            People

              jaltekruse Jason Altekruse
              rkins Rahul Kumar Challapalli
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: