[DRILL-3830] Query with aggregate window functions returns possibly wrong results on large scale data - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Invalid
Affects Version/s: 1.2.0
Fix Version/s: None
Component/s: Execution - Relational Operators
Labels: None
Environment:

10 Performance Nodes
DRILL_MAX_DIRECT_MEMORY=100g
DRILL_INIT_HEAP="8g"
DRILL_MAX_HEAP="8g"
planner.memory.query_max_memory_per_node bumped up to 20 GB
TPC-DS SF 1000 dataset (Parquet)

Description

In Drill, the following two queries return a result that differs slightly (by roughly 0.26%) from the one returned by Greenplum DB.

SELECT SUM(ss.ss_net_paid_inc_tax) OVER (PARTITION BY ss.ss_store_sk) FROM store_sales ss LIMIT 1;

SELECT SUM(ss.ss_net_paid_inc_tax) OVER (PARTITION BY ss.ss_store_sk ORDER BY ss.ss_store_sk) FROM store_sales ss LIMIT 2;

Drill:
9.653697131700665E9

Greenplum DB:
9.628946925860903E9

P.S. Both queries return the same result.

I was unable to reproduce this at a smaller scale (tried SF 1). I'll attach plans from both systems.
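
One caveat when comparing the two values: neither query has an outer ORDER BY, so LIMIT is free to return any row, and the two systems may be reporting the total for different ss_store_sk partitions. A minimal sketch of a deterministic comparison, assuming the same store_sales table and columns, is shown below. It swaps the window function for a plain GROUP BY (not the form used in the original queries) so that each system emits exactly one total per store in a fixed order:

```sql
-- Sketch only: one total per store, sorted by store key, so the output
-- from Drill and Greenplum DB can be compared store by store.
-- Uses GROUP BY instead of the window function from the queries above.
SELECT ss.ss_store_sk,
       SUM(ss.ss_net_paid_inc_tax) AS total_net_paid_inc_tax
FROM store_sales ss
GROUP BY ss.ss_store_sk
ORDER BY ss.ss_store_sk;
```

If the per-store totals agree across both systems, the difference above would come from which partition each LIMIT happened to return rather than from the aggregation itself.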

Attachments

gpdb_sf1000_plan.txt (3 kB, 23/Sep/15 21:52, Abhishek Girish)
gpdb_sf1_plan.txt (3 kB, 23/Sep/15 21:52, Abhishek Girish)
drill_sf1_plan.txt (6 kB, 23/Sep/15 21:52, Abhishek Girish)

People

Assignee: Abdel Hakim Deneche

Reporter: Abhishek Girish

Votes: 0

Watchers: 3

Dates

Created: 23/Sep/15 21:47

Updated: 25/Apr/16 22:17

Resolved: 27/Sep/15 04:20