[HIVE-4002] Fetch task aggregation for simple group by query - ASF JIRA

Details

Type: Improvement
Status: Closed
Priority: Minor
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 0.12.0
Component/s: Query Processor
Labels: None

Description

Aggregation queries with no group-by clause (for example, select count(*) from src) execute the final aggregation in a single reduce task. But that input is too small even for a single reducer, because most UDAFs generate just a single row for map aggregation. If the final fetch task can aggregate the outputs from the map tasks, the shuffling time can be removed.

This optimization transforms an operator tree such as

TS-FIL-SEL-GBY1-RS-GBY2-SEL-FS + FETCH-TASK

into

TS-FIL-SEL-GBY1-FS + FETCH-TASK(GBY2-SEL-LS)

With the patch, the time taken by the auto_join_filters.q test dropped from 10 min to 6 min.
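
As a rough illustration of the idea (not Hive code), the toy model below mimics the rewritten plan: each map task produces one partial count (the role of GBY1), and the fetch side merges those partials directly (the role of GBY2) with no reduce stage in between. The function names and the sample splits are hypothetical and chosen only for this sketch.

```python
from typing import Iterable, List


def map_side_partial_count(rows: Iterable[object]) -> int:
    """Hypothetical stand-in for GBY1: one map task emits one partial count."""
    return sum(1 for _ in rows)


def fetch_side_merge(partials: Iterable[int]) -> int:
    """Hypothetical stand-in for GBY2 inside the fetch task: merge partial counts."""
    return sum(partials)


# Four made-up input splits, one per map task.
splits: List[List[str]] = [["a", "b", "c"], ["d"], [], ["e", "f"]]

# Each map task writes a single row, so the merge input has one value per task.
partials = [map_side_partial_count(split) for split in splits]

# The client-side fetch step combines the partials; nothing is shuffled.
total = fetch_side_merge(partials)
```

Because every map task contributes only one row here, the merge step handles as many values as there are map tasks, which is why a dedicated reducer and its shuffle add little in this case.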

Attachments

HIVE-4002.D8739.1.patch | 21/Feb/13 03:22 | 83 kB | Phabricator
HIVE-4002.D8739.2.patch | 09/Jul/13 03:49 | 83 kB | Phabricator
HIVE-4002.D8739.3.patch | 23/Aug/13 01:38 | 81 kB | Phabricator
HIVE-4002.D8739.4.patch | 27/Aug/13 04:20 | 81 kB | Phabricator
HIVE-4002.patch | 01/Sep/13 16:41 | 77 kB | Yin Huai

Issue Links

is related to

HIVE-5793 Update hive-default.xml.template for HIVE-4002 (Resolved)

People

Assignee: Navis Ryu

Reporter: Navis Ryu

Votes: 0

Watchers: 7

Dates

Created: 08/Feb/13 09:03

Updated: 05/Jul/14 23:38

Resolved: 01/Sep/13 19:30