  1. Apache Drill
  2. DRILL-2801

ORDER BY produces extra records

Details

    Description

      Running Drill in embedded mode on my Mac.

      $ wc -w data.csv
         50000 data.csv
      

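      Here are the queries. COUNT(*) reports 50000 rows, but the ORDER BY query returns 50,001, and grouping the sorted output shows that the extra record has the value 5: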

      0: jdbc:drill:zk=local> SELECT count(*) FROM dfs.`data.csv`;
      +------------+
      |   EXPR$0   |
      +------------+
      | 50000      |
      +------------+
      1 row selected (0.223 seconds)
      0: jdbc:drill:zk=local> SELECT columns[0] FROM dfs.`data.csv` ORDER BY columns[0];
      +------------+
      |   EXPR$0   |
      +------------+
      ...
      | 6          |
      +------------+
      50,001 rows selected (0.928 seconds)
      0: jdbc:drill:zk=local> SELECT tab.col, COUNT(tab.col) FROM (SELECT columns[0] col FROM dfs.`data.csv` ORDER BY columns[0]) tab GROUP BY tab.col;
      +------------+------------+
      |    col     |   EXPR$1   |
      +------------+------------+
      | 2          | 10000      |
      | 3          | 10000      |
      | 4          | 10000      |
      | 5          | 10001      |
      | 6          | 10000      |
      +------------+------------+
      5 rows selected (0.704 seconds)
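
To rule out the input file itself as the source of the extra record, the per-value counts can be cross-checked outside Drill. The following is only a minimal sketch, not part of the original report: it assumes the attached data.csv is passed as the first command-line argument and that its first field corresponds to `columns[0]`; the helper name `count_first_column` is hypothetical.

```python
import csv
import sys
from collections import Counter


def count_first_column(lines):
    """Count how often each first-column value occurs in CSV input.

    `lines` is any iterable of text lines (an open file or a list of strings).
    """
    counts = Counter()
    for row in csv.reader(lines):
        if row:  # csv.reader yields [] for completely empty lines; skip them
            counts[row[0]] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Assumption: the path to data.csv is given as the first argument.
    with open(sys.argv[1], newline="") as f:
        counts = count_first_column(f)
    for value, n in sorted(counts.items()):
        print(f"{value}\t{n}")
    print(f"total\t{sum(counts.values())}")
```

If the totals printed here agree with the COUNT(*) result but not with the GROUP BY over the sorted subquery, that would point at the sort rather than at the data.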
      

      Attachments

        1. data.csv
          98 kB
          Sudheesh Katkam

            People

              sphillips Steven Phillips
              sudheeshkatkam Sudheesh Katkam