
Key: HADOOP-475
Type: New Feature
Status: Closed
Resolution: Won't Fix
Priority: Major
Assignee: Vivek Ratan
Reporter: Runping Qi
Votes: 0
Watchers: 3

Hadoop Common

The value iterator to reduce function should be clonable

Created: 24/Aug/06 09:17 PM   Updated: 08/Jul/09 04:51 PM
Component/s: None
Affects Version/s: None
Fix Version/s: None

Time Tracking:
Not Specified

Issue Links:
Blocker
 

Resolution Date: 11/Jul/07 12:15 AM


Description
In the current framework, when the user implements the reduce method of the Reducer class,
the user can iterate through the value iterator only once.
This makes it hard for the user to perform join-like operations within the reduce method.
To address this problem, one approach is to make the input value iterator clonable, so that the user can iterate over the values in different ways.
If the iterator can also be reset, the user can perform nested iterations over the data and thus
carry out join-like operations.

The user code in the reduce method would look something like:

iterator1 = values.clone();
iterator2 = values.clone();
while (iterator1.hasNext()) {
    val1 = iterator1.next();
    iterator2.reset();   // rewind the inner iterator for each outer value
    while (iterator2.hasNext()) {
        val2 = iterator2.next();
        // do something based on val1 and val2
    }
}
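
For illustration only, the nested iteration above can be approximated today by buffering the values in memory. The following is a minimal sketch, not part of the Hadoop API: the class `BufferedIterator` and its `reset()` and `copy()` methods are hypothetical stand-ins for the proposed `reset()` and `clone()`, and the sketch assumes all values for one key fit in memory.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

final class ResettableValues {

    /** Hypothetical iterator that buffers its source so it can be rewound. */
    static final class BufferedIterator<T> implements Iterator<T> {
        private final List<T> buffer;
        private int position = 0;

        /** Drains the one-pass source iterator into an in-memory list. */
        BufferedIterator(Iterator<T> source) {
            buffer = new ArrayList<>();
            while (source.hasNext()) {
                buffer.add(source.next());
            }
        }

        /** Shares an existing buffer; used by copy(). */
        private BufferedIterator(List<T> shared) {
            buffer = shared;
        }

        @Override
        public boolean hasNext() {
            return position < buffer.size();
        }

        @Override
        public T next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            return buffer.get(position++);
        }

        /** Rewinds to the first value (stand-in for the proposed reset()). */
        void reset() {
            position = 0;
        }

        /** Independent cursor over the same values (stand-in for clone()). */
        BufferedIterator<T> copy() {
            return new BufferedIterator<>(buffer);
        }
    }

    /** Pairs every value with every value, as in the nested loop above. */
    static <T> List<String> selfJoin(Iterator<T> values) {
        BufferedIterator<T> outer = new BufferedIterator<>(values);
        BufferedIterator<T> inner = outer.copy();
        List<String> pairs = new ArrayList<>();
        while (outer.hasNext()) {
            T val1 = outer.next();
            inner.reset();
            while (inner.hasNext()) {
                T val2 = inner.next();
                pairs.add(val1 + "," + val2);
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        System.out.println(selfJoin(Arrays.asList("a", "b").iterator()));
    }
}
```

The cost of this approach is that every value for the key is held in memory at once, which is the limitation a framework-level clonable iterator would need to avoid.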

One possible optimization is that if the values are sorted based on a secondary key,
the reset function can take a secondary key as an argument and reset the iterator to the beginning
position of that secondary key. It would be very helpful to have a utility that returns the iterators,
one per secondary key value, from the given iterator:

TreeMap getIteratorsBasedOnSecondaryKey(iterator);

Each entry in the returned map object is a pair of <secondary key, iterator for the values with the same secondary key>.
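
One way such a utility might look is sketched below. This is an assumption-laden illustration rather than a proposed implementation: the `secondaryKeyOf` extractor parameter is not in the signature above and is added here because the sketch needs some way to obtain the secondary key from a value, and the values are grouped in memory.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

final class SecondaryKeyGrouping {

    /**
     * Groups values by secondary key and returns one iterator per key,
     * ordered by key. The secondaryKeyOf parameter is hypothetical.
     */
    static <K extends Comparable<K>, V> TreeMap<K, Iterator<V>> getIteratorsBasedOnSecondaryKey(
            Iterator<V> values, Function<V, K> secondaryKeyOf) {
        // Collect the values for each secondary key, preserving arrival order.
        TreeMap<K, List<V>> groups = new TreeMap<>();
        while (values.hasNext()) {
            V value = values.next();
            groups.computeIfAbsent(secondaryKeyOf.apply(value), k -> new ArrayList<>())
                  .add(value);
        }
        // Expose each group as an iterator, keyed by its secondary key.
        TreeMap<K, Iterator<V>> result = new TreeMap<>();
        for (Map.Entry<K, List<V>> entry : groups.entrySet()) {
            result.put(entry.getKey(), entry.getValue().iterator());
        }
        return result;
    }
}
```

For a join, the secondary key could for instance be a tag identifying which input a value came from, so that the reduce method can pair the iterator for one tag with the iterator for another.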


