[HIVE-27480] OFFSET without ORDER BY generates wrong results - ASF JIRA

XML

Word

Printable

JSON

Details

Type: Bug
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: 2.4.0, 3.1.3, 4.0.0-alpha-2
Fix Version/s: None
Component/s: Query Planning
Labels:
- pull-request-available

Description

Without ORDER BY, Hive doesn't add an additional single reducer, and it pushes OFFSET and LIMIT to the original vertex. It can generate incorrect results because OFFSET should be counted globally unlike LIMIT.

We would make the following changes.

To fix the incorrect behavior
To add a new `hive.strict.checks.*` to prevent this usage. Mostly, OFFSET without ORDER BY is meaningless

We can reproduce the issue in the following steps.

Prepare test data

The following SQLs generate a test table with multiple files.

$ beeline -e "
  drop table test;
  create table test(id int);
  insert into test values (1), (2), (3), (4), (5), (6), (7), (8), (9), (10);
  insert into test values (11), (12), (13), (14), (15), (16), (17), (18), (19), (20);
  insert into test values (21), (22), (23), (24), (25), (26), (27), (28), (29), (30);
"

Reproduce the issue

No rows are returned with `limit 10 offset 10`.

$ beeline -e "select * from test limit 10, 10" --hiveconf hive.fetch.task.conversion=none --hiveconf tez.grouping.min-size=1 --hiveconf tez.grouping.max-size=1
...
+----------+
| test.id  |
+----------+
+----------+
No rows selected (13.595 seconds)

Expected behavior

It should return any 10 rows like this.

$ beeline -e "select * from test limit 10, 10"
...
+----------+
| test.id  |
+----------+
| 11       |
| 12       |
| 13       |
| 14       |
| 15       |
| 16       |
| 17       |
| 18       |
| 19       |
| 20       |
+----------+
10 rows selected (0.175 seconds)

Attachments

Issue Links

links to

GitHub Pull Request #4467

GitHub Pull Request #4511

Activity

People

Assignee:: Shohei Okumiya

Reporter:: Shohei Okumiya

Votes:: 0 Vote for this issue

Watchers:: 2 Start watching this issue

Dates

Created:: 02/Jul/23 14:07

Updated:: 08/Sep/23 13:51

Resolved:: 08/Sep/23 13:27