Apache Drill / DRILL-2801

ORDER BY produces extra records


Details

    Description

      Running Drill in embedded mode on my Mac.

      $ wc -w data.csv
         50000 data.csv
      

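      Here are the queries. COUNT(*) reports 50,000 rows, but the ORDER BY query returns 50,001, and grouping the sorted output shows that the extra record has the value 5: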

      0: jdbc:drill:zk=local> SELECT count(*) FROM dfs.`data.csv`;
      +------------+
      |   EXPR$0   |
      +------------+
      | 50000      |
      +------------+
      1 row selected (0.223 seconds)
      0: jdbc:drill:zk=local> SELECT columns[0] FROM dfs.`data.csv` ORDER BY columns[0];
      +------------+
      |   EXPR$0   |
      +------------+
      ...
      | 6          |
      +------------+
      50,001 rows selected (0.928 seconds)
      0: jdbc:drill:zk=local> SELECT tab.col, COUNT(tab.col) FROM (SELECT columns[0] col FROM dfs.`data.csv` ORDER BY columns[0]) tab GROUP BY tab.col;
      +------------+------------+
      |    col     |   EXPR$1   |
      +------------+------------+
      | 2          | 10000      |
      | 3          | 10000      |
      | 4          | 10000      |
      | 5          | 10001      |
      | 6          | 10000      |
      +------------+------------+
      5 rows selected (0.704 seconds)
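
      The original data.csv is not attached to the issue, so the following is only a sketch for building a stand-in file and getting ground-truth counts outside Drill. The file name repro.csv and its contents (one value per line, 10,000 copies each of 2 through 6) are assumptions inferred from the GROUP BY output above, not the reporter's actual data.

```shell
# Hypothetical stand-in for the reporter's data.csv (contents assumed, not from the issue):
# one value per line, 10,000 copies each of the values 2 through 6.
awk 'BEGIN { for (v = 2; v <= 6; v++) for (i = 0; i < 10000; i++) print v }' > repro.csv

# Ground truth computed outside Drill: total line count, then per-value counts.
wc -l < repro.csv
sort repro.csv | uniq -c
```

      If the queries above are pointed at this file instead, any per-value count from Drill that differs from the uniq -c output would indicate the same extra-record behaviour. Whether the stand-in triggers the bug at all is not known.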
      

          People

            sphillips Steven Phillips
            sudheeshkatkam Sudheesh Katkam