Details
- Type: Improvement
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- None
- None
Description
Currently, Tajo provides a three-stage optimization for distinct aggregation queries, but it supports only a single column for distinct aggregation, as in the following query:
Query1
select a.flag, count(distinct a.id) as cnt, sum(distinct a.id) as total from table1 a group by a.flag
If a query uses two or more columns for distinct aggregation, the optimized distinct aggregation cannot be applied, as in the following query:
Query2
select a.flag, count(distinct a.id) as cnt, sum(distinct a.id) as total, count(distinct a.name) as cnt2, count(distinct a.code) as cnt3 from table1 a group by a.flag
In this case, the query may perform poorly. Thus, we need to improve multiple DISTINCT aggregation. Specifically, the three-stage approach should also support aggregation over multiple DISTINCT columns.
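To make the idea concrete, here is a rough Python sketch of one way distinct aggregates over several columns could be computed in three stages. It is an illustration under assumptions, not Tajo's actual implementation: the function names (`local_distinct`, `global_distinct`, `final_aggregate`), the in-memory partitions, and the sample rows are all hypothetical, and only `count(distinct ...)` is shown.

```python
from collections import defaultdict

def local_distinct(partition, group_key, distinct_cols):
    """Stage 1 (per partition): emit each (group, column, value) tuple once."""
    return {
        (row[group_key], col, row[col])
        for row in partition
        for col in distinct_cols
        if row[col] is not None  # NULLs are ignored by count(distinct ...)
    }

def global_distinct(local_sets):
    """Stage 2: merge partition outputs, dropping cross-partition duplicates."""
    merged = set()
    for tuples in local_sets:
        merged |= tuples
    return merged

def final_aggregate(distinct_tuples):
    """Stage 3: count the surviving distinct values per group and column."""
    counts = defaultdict(lambda: defaultdict(int))
    for group, col, _value in distinct_tuples:
        counts[group][col] += 1
    return {group: dict(cols) for group, cols in counts.items()}

# Hypothetical sample data split across two partitions.
partitions = [
    [{"flag": "x", "id": 1, "name": "a", "code": "p"},
     {"flag": "x", "id": 1, "name": "b", "code": "p"}],
    [{"flag": "x", "id": 2, "name": "a", "code": "p"},
     {"flag": "y", "id": 3, "name": "c", "code": "q"}],
]
cols = ["id", "name", "code"]

stage1 = [local_distinct(p, "flag", cols) for p in partitions]
result = final_aggregate(global_distinct(stage1))
```

Tagging each value with its column name lets all distinct columns share one pass over the data, so adding `a.name` and `a.code` to the query does not require a separate deduplication job per column.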
Issue Links
- is related to TAJO-601 Improve distinct aggregation query processing (Resolved)