[PIG-2651] Provide a much easier to use accumulator interface - ASF JIRA

Details

Type: New Feature
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 0.11
Component/s: None
Labels: None
Hadoop Flags: Reviewed

Description

This introduces a new interface, IteratingAccumulatorEvalFunc (that name is not final). The nice thing about this patch is that it is built purely on top of the existing Accumulator code (it uses PIG-2066, but it could easily work without it). That is, it offers an easier way to write accumulators without having to fork the Pig codebase.

The downside is that the only way I have found to provide such a clean interface is to use a second thread. I still need to explore the performance implications, but given that most of Pig's easy-to-use features carry a performance cost, I think the much more usable interface is worth it as long as we measure and document that cost. I also don't expect it to be too bad, since one thread does the heavy lifting while the other just ferries values in between.

SUM could now be written as:

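import java.io.IOException;
import java.util.Iterator;

import org.apache.pig.data.Tuple;

// IteratingAccumulatorEvalFunc is the class introduced by this patch.
public class SUM extends IteratingAccumulatorEvalFunc<Long> {
    // Receives every tuple of the group through a single iterator.
    public Long exec(Iterator<Tuple> it) throws IOException {
        long sum = 0;

        while (it.hasNext()) {
            sum += (Long)it.next().get(0);
        }

        return sum;
    }
}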
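
To make the two-thread idea described above concrete, here is a minimal sketch with no Pig dependencies. It is not the code from the attached patches: IteratorBridge, handOff, and the use of a SynchronousQueue are all hypothetical choices made for illustration. The caller pushes batches in through accumulate(), the way Pig drives an Accumulator, while a worker thread runs the iterator-based function and sees all batches as one stream. The sketch assumes values are non-null and does not propagate exceptions thrown by the user function.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

/** Hypothetical bridge: batches pushed in by the caller reach a worker thread as one Iterator. */
final class IteratorBridge<T, R> {
    private static final Object END = new Object(); // end-of-input marker
    private final SynchronousQueue<Object> handoff = new SynchronousQueue<>();
    private final Function<Iterator<T>, R> userFunction;
    private Thread worker;
    private R result; // written by the worker, read only after join()

    IteratorBridge(Function<Iterator<T>, R> userFunction) {
        this.userFunction = userFunction;
    }

    /** Ferries one batch of non-null values to the worker, one element at a time. */
    void accumulate(Iterable<T> batch) {
        for (T value : batch) {
            handOff(value);
        }
    }

    /** Signals end of input, waits for the worker, and returns what the user function computed. */
    R getValue() {
        handOff(END);
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the worker", e);
        }
        return result;
    }

    private void handOff(Object item) {
        if (worker == null) { // start the worker lazily on first use
            worker = new Thread(() -> result = userFunction.apply(new QueueIterator()));
            worker.setDaemon(true);
            worker.start();
        }
        try {
            // Retry until the worker takes the item; stop if it has already returned (early termination).
            while (worker.isAlive() && !handoff.offer(item, 10, TimeUnit.MILLISECONDS)) {
                // keep waiting
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while handing off a value", e);
        }
    }

    /** Iterator seen by the user function; blocks until the next value or END arrives. */
    private final class QueueIterator implements Iterator<T> {
        private Object next; // null means "nothing fetched yet"

        @Override
        public boolean hasNext() {
            if (next == null) {
                try {
                    next = handoff.take();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    next = END; // treat interruption as end of input
                }
            }
            return next != END;
        }

        @Override
        @SuppressWarnings("unchecked")
        public T next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            T value = (T) next;
            next = null;
            return value;
        }
    }
}

final class IteratorBridgeDemo {
    /** Sums two batches through the bridge, mirroring the SUM example. */
    static Long sumInBatches() {
        IteratorBridge<Long, Long> sum = new IteratorBridge<>(it -> {
            long total = 0;
            while (it.hasNext()) {
                total += it.next();
            }
            return total;
        });
        sum.accumulate(Arrays.asList(1L, 2L, 3L));
        sum.accumulate(Arrays.asList(4L, 5L));
        return sum.getValue(); // 15
    }
}
```

Because each hand-off blocks until the worker takes the value, accumulate() returns only once its batch has been consumed or the worker has finished, so the caller is free to reuse the batch afterwards. The price is one thread rendezvous per value, which is the kind of overhead the performance tests would need to measure.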

Besides performance tests, I need to figure out how to properly test this sort of thing. I particularly welcome advice on that front.

Attachments

- PIG-2651-0.patch, 13/Apr/12 08:25, 16 kB, Jonathan Coveney
- PIG-2651-1.patch, 11/May/12 01:13, 31 kB, Jonathan Coveney
- PIG-2651-2.patch, 31/May/12 21:20, 14 kB, Jonathan Coveney

Issue Links

incorporates: PIG-2066 Accumulators should be able to early-terminate (Closed)

People

Assignee: Jonathan Coveney

Reporter: Jonathan Coveney

Votes: 0

Watchers: 2

Dates

Created: 13/Apr/12 08:18

Updated: 22/Feb/13 04:54

Resolved: 08/Jun/12 22:38