Apache Drill / DRILL-3783

Incorrect results: COUNT(<column-name>) over results returned by UNION ALL

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Not A Problem
    • Affects Version/s: 1.2.0
    • Fix Version/s: 1.2.0
    • Labels:
      None
    • Environment:

      4 node cluster on CentOS

      Description

      COUNT over the results of a UNION ALL query returns incorrect results. The query below previously threw an exception (please see DRILL-2637). That JIRA was marked as fixed; however, the query now returns incorrect results.

      0: jdbc:drill:schema=dfs.tmp> select count(c1) from (select cast(columns[0] as int) c1 from `testWindow.csv`) union all (select cast(columns[0] as int) c2 from `testWindow.csv`);
      +---------+
      | EXPR$0  |
      +---------+
      | 11      |
      | 100     |
      | 10      |
      | 2       |
      | 50      |
      | 55      |
      | 67      |
      | 113     |
      | 119     |
      | 89      |
      | 57      |
      | 61      |
      +---------+
      12 rows selected (0.753 seconds)
      

      The queries on the LHS and RHS of the UNION ALL operator each return 11 rows:

      0: jdbc:drill:schema=dfs.tmp> select cast(columns[0] as int) c1 from `testWindow.csv`;
      +------+
      |  c1  |
      +------+
      | 100  |
      | 10   |
      | 2    |
      | 50   |
      | 55   |
      | 67   |
      | 113  |
      | 119  |
      | 89   |
      | 57   |
      | 61   |
      +------+
      11 rows selected (0.197 seconds)
      0: jdbc:drill:schema=dfs.tmp> select cast(columns[0] as int) c2 from `testWindow.csv`;
      +------+
      |  c2  |
      +------+
      | 100  |
      | 10   |
      | 2    |
      | 50   |
      | 55   |
      | 67   |
      | 113  |
      | 119  |
      | 89   |
      | 57   |
      | 61   |
      +------+
      11 rows selected (0.173 seconds)
      

      Note that enclosing the whole UNION ALL in parentheses, so that it becomes the subquery that COUNT reads from, returns the correct result. We do not want to return incorrect results to the user when the parentheses are missing.

      0: jdbc:drill:schema=dfs.tmp> select count(c1) from ((select cast(columns[0] as int) c1 from `testWindow.csv`) union all (select cast(columns[0] as int) c2 from `testWindow.csv`));
      +---------+
      | EXPR$0  |
      +---------+
      | 22      |
      +---------+
      1 row selected (0.234 seconds)
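
      The 12-row output of the first query is consistent with the unparenthesized statement being grouped as a COUNT over the first subquery only, followed by a UNION ALL with the second SELECT: the first row (11) is the count, and the remaining 11 rows are the values from `testWindow.csv`. A sketch of that grouping with the parentheses written out explicitly, assuming Drill accepts parenthesized top-level queries in this form (this rewrite has not been run and is for illustration only):

      ```sql
      -- Assumed grouping of the unparenthesized query:
      -- COUNT applies only to the first branch of the UNION ALL.
      (select count(c1)
         from (select cast(columns[0] as int) c1 from `testWindow.csv`))
      union all
      -- The second branch contributes its 11 rows unaggregated.
      (select cast(columns[0] as int) c2 from `testWindow.csv`);
      ```

      Under this reading, the FROM clause ends at the closing parenthesis of the first subquery, so the UNION ALL combines two complete SELECT statements instead of feeding COUNT.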
      

            People

            • Assignee: Khurram Faraaz (khfaraaz)
            • Reporter: Khurram Faraaz (khfaraaz)