[DRILL-2801] ORDER BY produces extra records - ASF JIRA

Details

Type: Bug
Status: Resolved
Priority: Critical
Resolution: Duplicate
Affects Version/s: 0.8.0
Fix Version/s: 1.0.0
Component/s: Execution - Relational Operators
Labels: None

Description

Running Drill in embedded mode on my Mac. The input file has 50,000 entries:

$ wc -w data.csv
   50000 data.csv

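Here are the queries. COUNT(*) reports 50,000 rows, but ORDER BY returns 50,001, and grouping the sorted output shows that the extra row has the value 5: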

0: jdbc:drill:zk=local> SELECT count(*) FROM dfs.`data.csv`;
+------------+
|   EXPR$0   |
+------------+
| 50000      |
+------------+
1 row selected (0.223 seconds)
0: jdbc:drill:zk=local> SELECT columns[0] FROM dfs.`data.csv` ORDER BY columns[0];
+------------+
|   EXPR$0   |
+------------+
...
| 6          |
+------------+
50,001 rows selected (0.928 seconds)
0: jdbc:drill:zk=local> SELECT tab.col, COUNT(tab.col) FROM (SELECT columns[0] col FROM dfs.`data.csv` ORDER BY columns[0]) tab GROUP BY tab.col;
+------------+------------+
|    col     |   EXPR$1   |
+------------+------------+
| 2          | 10000      |
| 3          | 10000      |
| 4          | 10000      |
| 5          | 10001      |
| 6          | 10000      |
+------------+------------+
5 rows selected (0.704 seconds)
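
To confirm which value gains the extra row, the per-value counts can be computed outside Drill and compared with the GROUP BY output above. The following is a minimal sketch, not part of the original report; the helper names `first_column_counts` and `surplus` are hypothetical, and it assumes the first CSV field corresponds to `columns[0]`.

```python
import csv
from collections import Counter


def first_column_counts(lines):
    """Count occurrences of each value in the first CSV field.

    `lines` is any iterable of text lines (an open file works).
    Values are kept as strings; empty rows are skipped.
    """
    return Counter(row[0] for row in csv.reader(lines) if row)


def surplus(observed, expected):
    """Return {value: extra_count} for values that appear more often
    in `observed` than in `expected`.

    Counter subtraction keeps only the positive differences.
    """
    return dict(Counter(observed) - Counter(expected))
```

One possible use: open `data.csv` with `newline=""`, pass the file object to `first_column_counts` to get the expected counts, build the observed counts from the query result, and pass both to `surplus`. Any key in the result identifies a value for which the sort emitted more rows than the file contains.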

Attachments

data.csv (98 kB), attached 15/Apr/15 19:11 by Sudheesh Katkam

Issue Links

duplicates DRILL-2083: order by on large dataset returns wrong results (Closed)

People

Assignee: Steven Phillips

Reporter: Sudheesh Katkam

Votes: 0

Watchers: 1

Dates

Created: 15/Apr/15 19:11

Updated: 28/Apr/15 18:27

Resolved: 28/Apr/15 18:27