
Key: HADOOP-3526
Type: Bug
Status: Resolved
Resolution: Fixed
Priority: Blocker
Assignee: Spyros Blanas
Reporter: Spyros Blanas
Votes: 0
Watchers: 1
Hadoop Common

contrib/data_join doesn't work

Created: 10/Jun/08 06:05 PM   Updated: 22/Aug/08 07:48 PM
Component/s: None
Affects Version/s: 0.17.0
Fix Version/s: 0.17.1

Time Tracking:
Not Specified

File Attachments:
patch (1 kB), uploaded 2008-06-16 09:17 PM by Spyros Blanas, licensed for inclusion in ASF works
Issue Links:
Reference
 

Hadoop Flags: Reviewed
Resolution Date: 18/Jun/08 10:51 PM


Description
The example in the README.txt in contrib/data_join/src/example doesn't work in 0.17.0. It works perfectly in 0.16.4.

Spyros Blanas added a comment - 12/Jun/08 06:38 PM
The issue is related to HADOOP-3522. The patch just clones the objects returned from the ValueIterator and forces the user-provided classes to implement a clone() method as well.
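
The failure mode described above can be reproduced without Hadoop. The following is a minimal simulation, assuming the values iterator hands back one reused object on every call, as the reference to HADOOP-3522 suggests. `ReuseDemo`, `Record`, and `reusingIterator` are hypothetical names for illustration, not classes from data_join.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class ReuseDemo {
    /** Hypothetical mutable record, standing in for a value returned by the framework. */
    public static final class Record {
        String payload;
        Record(String payload) { this.payload = payload; }
        Record copy() { return new Record(payload); }
    }

    /** Iterator that refills one shared Record on every next() call (assumed behavior). */
    static Iterator<Record> reusingIterator(List<String> payloads) {
        final Record shared = new Record(null);
        final Iterator<String> source = payloads.iterator();
        return new Iterator<Record>() {
            @Override public boolean hasNext() { return source.hasNext(); }
            @Override public Record next() {
                shared.payload = source.next(); // overwrite in place, no new object
                return shared;
            }
        };
    }

    /** Buffers every value, optionally copying it first, and reports what was kept. */
    public static List<String> buffer(List<String> payloads, boolean copy) {
        List<Record> kept = new ArrayList<>();
        Iterator<Record> it = reusingIterator(payloads);
        while (it.hasNext()) {
            Record r = it.next();
            kept.add(copy ? r.copy() : r);
        }
        List<String> seen = new ArrayList<>();
        for (Record r : kept) {
            seen.add(r.payload);
        }
        return seen;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a", "b", "c");
        System.out.println(buffer(input, false)); // [c, c, c]: all entries alias one object
        System.out.println(buffer(input, true));  // [a, b, c]: each entry is its own copy
    }
}
```

Code that buffers values across calls to `next()` only sees distinct records if it copies each one before storing it.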

Runping Qi added a comment - 13/Jun/08 09:23 PM

Since applications using this package break with 0.17, I think this is a blocker for 0.17.1.


Chris Douglas added a comment - 14/Jun/08 02:41 AM
Spyros-

This looks like the right idea; a couple suggestions:

  • It would break less existing code if this used WritableUtils::clone instead of adding an abstract method
  • The tag at DataJoinReducerBase::107 doesn't need to be cloned
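
To illustrate the first suggestion: a generic copy helper lets the framework duplicate any serializable record without each user class having to implement its own clone() method. The sketch below shows that idea with a serialization round trip, using only the JDK. It does not call Hadoop's WritableUtils; `MiniWritable`, `TaggedRecord`, and `copy` are hypothetical stand-ins, and the factory argument is an assumption made to keep the example free of reflection.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.function.Supplier;

public class RoundTripCopy {
    /** Hypothetical stand-in for a serializable record interface. */
    public interface MiniWritable {
        void write(DataOutput out) throws IOException;
        void readFields(DataInput in) throws IOException;
    }

    /** Example user class: it only knows how to serialize itself, not how to clone. */
    public static final class TaggedRecord implements MiniWritable {
        public String tag = "";
        public String data = "";

        @Override public void write(DataOutput out) throws IOException {
            out.writeUTF(tag);
            out.writeUTF(data);
        }

        @Override public void readFields(DataInput in) throws IOException {
            tag = in.readUTF();
            data = in.readUTF();
        }
    }

    /** Copies any MiniWritable by writing it to bytes and reading them into a fresh instance. */
    public static <T extends MiniWritable> T copy(T orig, Supplier<T> factory) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            orig.write(new DataOutputStream(bytes));
            T fresh = factory.get();
            fresh.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return fresh;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        TaggedRecord original = new TaggedRecord();
        original.tag = "A";
        original.data = "a1";
        TaggedRecord duplicate = copy(original, TaggedRecord::new);
        original.data = "overwritten"; // later reuse of the original does not affect the copy
        System.out.println(duplicate.tag + " " + duplicate.data);
    }
}
```

Existing user classes keep compiling under this approach, which is the compatibility argument made in the suggestion.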

Nigel Daley added a comment - 14/Jun/08 07:09 PM
-1 as the patch should have a unit test.

Spyros Blanas added a comment - 16/Jun/08 09:17 PM
Chris,
The tag needs cloning because it's stored as a key in the SortedMap. Otherwise, the Text.equals() method always returns true (see HADOOP-3522).
As for the WritableUtils.clone() suggestion, I have uploaded a new version of the patch that incorporates it.

Nigel,
Unfortunately there is no existing JUnit test covering the contributed data_join code and I don't know how to change the build.xml file to run the new tests. Also, building a JUnit testing framework for generic map-reduce jobs like data_join is much more difficult than targeting specific functions, as done in the core. I would be happy to create a test if you can point me to some code that tests a generic map/reduce job that I can tweak.


Nigel Daley added a comment - 16/Jun/08 10:16 PM
Spyros, Chris has agreed to write some unit tests for Hadoop 0.18.

Chris Douglas added a comment - 16/Jun/08 10:34 PM

The tag needs cloning, because it's stored as a key in the SortedMap. Otherwise, the Text.equals() method always returns true

Sorry, I meant that it doesn't need to be cloned there, for the lookup. Since the record needs to be cloned anyway and it includes the tag, it should be sufficient to clone the record and use the cloned record's tag as the map key.
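
The point about the map key can be sketched as follows. This is a JDK-only simulation, not the DataJoinReducerBase code: `Tag`, `Record`, and `regroup` are hypothetical, and the single `shared` object models the assumed reuse of one record by the framework. The lookup uses the uncloned tag, while the key stored in the map comes from the cloned record.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

public class RegroupDemo {
    /** Hypothetical mutable tag, standing in for a reused key object. */
    static final class Tag implements Comparable<Tag> {
        String name;
        Tag(String name) { this.name = name; }
        @Override public int compareTo(Tag other) { return name.compareTo(other.name); }
    }

    /** Hypothetical record that carries its own tag. */
    static final class Record {
        final Tag tag;
        String data;
        Record(String tag, String data) { this.tag = new Tag(tag); this.data = data; }
        Record copy() { return new Record(tag.name, data); } // deep copy, including the tag
    }

    /** Groups rows of {tag, data} by tag while the "framework" reuses one Record. */
    public static SortedMap<String, List<String>> regroup(String[][] rows, boolean cloneRecord) {
        Record shared = new Record("", "");
        SortedMap<Tag, List<String>> groups = new TreeMap<>();
        for (String[] row : rows) {
            shared.tag.name = row[0]; // simulated reuse: overwrite in place
            shared.data = row[1];
            Record kept = cloneRecord ? shared.copy() : shared;
            List<String> bucket = groups.get(shared.tag); // lookup with the uncloned tag
            if (bucket == null) {
                bucket = new ArrayList<>();
                groups.put(kept.tag, bucket); // stored key is the kept record's tag
            }
            bucket.add(kept.data);
        }
        SortedMap<String, List<String>> result = new TreeMap<>();
        for (Map.Entry<Tag, List<String>> e : groups.entrySet()) {
            result.put(e.getKey().name, e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        String[][] rows = {{"A", "a1"}, {"B", "b1"}, {"A", "a2"}};
        System.out.println(regroup(rows, false)); // {A=[a1, b1, a2]}: one merged group
        System.out.println(regroup(rows, true));  // {A=[a1, a2], B=[b1]}
    }
}
```

Without the clone, the stored key is the very object being mutated, so every lookup compares the tag with itself and all records land in one group. With the clone, the stored keys are stable and the uncloned tag is safe to use for the lookup.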


Chris Douglas added a comment - 18/Jun/08 10:51 PM
I just committed this. Thanks, Spyros

Hudson added a comment - 19/Jun/08 12:35 PM