[PIG-3562] Implement combiner optimizations for DISTINCT - ASF JIRA

Details

Type: Sub-task
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: tez-branch
Fix Version/s: tez-branch
Component/s: tez
Labels: None

Description

Currently, DISTINCT is implemented in a straightforward manner per https://issues.apache.org/jira/browse/PIG-3538.

However, we can implement two types of combiner optimizations for DISTINCT, just as the MRCompiler does for MapReduce:
1. A simple DistinctCombiner that discards duplicate tuples
2. An optimizer that transforms certain uses of DISTINCT into an algebraic UDF form
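
The two ideas above can be sketched in plain Java as follows. This is only an illustration under stated assumptions, not the code in the attached patch and not Pig's actual classes: the class name DistinctCombinerSketch and every method name in it are hypothetical, and tuples are modeled as ordinary Java values in java.util collections. The second half loosely follows the initial / intermediate / final split that an algebraic function uses, applied to a count of distinct values.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; none of these names come from Pig or from the patch.
final class DistinctCombinerSketch {

    // Idea 1: a combiner that drops duplicates within one partial batch,
    // so fewer tuples have to be sent on to the final stage.
    static <T> List<T> combine(List<T> tuples) {
        return new ArrayList<>(new LinkedHashSet<>(tuples));
    }

    // The final stage must still deduplicate across batches, because the
    // same tuple can survive in more than one combined batch.
    static <T> List<T> reduce(List<List<T>> partials) {
        Set<T> seen = new LinkedHashSet<>();
        for (List<T> partial : partials) {
            seen.addAll(partial);
        }
        return new ArrayList<>(seen);
    }

    // Idea 2: a distinct count expressed in three algebraic-style steps.
    // Initial: turn one batch of raw values into a set.
    static <T> Set<T> initial(List<T> tuples) {
        return new LinkedHashSet<>(tuples);
    }

    // Intermediate: union partial sets; the result has the same type as the
    // input elements, so this step can be applied any number of times.
    static <T> Set<T> intermediate(List<Set<T>> partials) {
        Set<T> merged = new LinkedHashSet<>();
        for (Set<T> partial : partials) {
            merged.addAll(partial);
        }
        return merged;
    }

    // Final: union the remaining partial sets and report the size.
    static <T> long finalCount(List<Set<T>> partials) {
        return intermediate(partials).size();
    }
}
```

In this sketch, combining the batch (a, b, a) yields (a, b), and reducing that together with a second combined batch (b, c) yields (a, b, c). The combiner is purely an optimization: skipping it changes how much data is moved, not the result.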

Attachments

PIG-3562-0.patch (11 kB), 08/Jan/14 01:48, Alex Bain

People

Assignee: Alex Bain

Reporter: Alex Bain

Votes: 0

Watchers: 2

Dates

Created: 05/Nov/13 19:34

Updated: 21/Nov/14 05:59

Resolved: 08/Jan/14 23:40