If the key is modified in the map, the outcome of the join is affected, so the key needs to be cloned out of the join collector. A test for this should probably be added to both TestJoinDatamerge and TestDatamerge. The values in the RRs in the tree are not restored either, which may also be an issue.
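To make the aliasing concern concrete, here is a minimal sketch. It does not use the real join classes: MutableKey is a hypothetical stand-in for a mutable Writable key, and run() is a hypothetical loop in which a reader hands its key to a mapper and then consults that same key again afterwards.

```java
import java.util.ArrayList;
import java.util.List;

public class KeyAliasingSketch {
    // Hypothetical stand-in for a mutable Writable key such as IntWritable.
    static final class MutableKey {
        private int value;
        MutableKey(int value) { this.value = value; }
        int get() { return value; }
        void set(int value) { this.value = value; }
        MutableKey copy() { return new MutableKey(value); }
    }

    // Simulates a reader that walks keys 1..3, hands each key to the "mapper",
    // and then reads its own key again. With cloneKey == false the mapper gets
    // the reader's internal key object; with cloneKey == true it gets a copy.
    static List<Integer> run(boolean cloneKey) {
        List<Integer> seenByReader = new ArrayList<>();
        MutableKey internal = new MutableKey(0);
        for (int k = 1; k <= 3; k++) {
            internal.set(k);
            MutableKey handedOut = cloneKey ? internal.copy() : internal;
            handedOut.set(-1);                // the mapper mutates its key, as in key.set(-1)
            seenByReader.add(internal.get()); // state the reader would rely on afterwards
        }
        return seenByReader;
    }

    public static void main(String[] args) {
        System.out.println("shared key: " + run(false)); // prints "shared key: [-1, -1, -1]"
        System.out.println("cloned key: " + run(true));  // prints "cloned key: [1, 2, 3]"
    }
}
```

With the shared key the reader's own state is overwritten by the mapper; with the copy it is left intact. This only illustrates the key; values held further down the tree would need the same care.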
I was not able to write a test case for this. I tried to verify it by calling key.set(-1) in the mapper, but I still saw the problem even when I cloned the key.
Thanks for the unit test, Chris. I updated the patch with the unit test and fixed the bug.
The checks verifying type consistency for keys in general and for values in MultiFilterRecordReader have been removed. Are these not necessary?
The checks verifying type consistency for values in MultiFilterRecordReader are not necessary, because we can have a join such as override(inner(A,B),A), whose children emit different value types. I removed the consistency checks for values in MultiFilterRecordReader.
I also added a test, TestJoinProperties (suggested by Chris), which tests:
1. Outer join associativity: outer(outer(A, B), C) == outer(A, outer(B, C)) == outer(A, B, C)
2. Inner join associativity: inner(inner(A, B), C) == inner(A, inner(B, C)) == inner(A, B, C)
3. Override identity, inner consistency: override(inner(A, B), A) == A
These tests also use different value types in the sources.
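For readers who want to see the three properties on toy data, here is a simplified in-memory sketch. It is not the TestJoinProperties code and uses none of the join package classes. Rel, source, inner, outer and override are hypothetical helpers, and the model assumes unique keys per source, nested tuples flattened before comparison, and override preferring the rightmost source that has the key.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class JoinPropertiesSketch {
    // A relation: unique sorted keys, each mapped to a flattened tuple.
    static final class Rel {
        final int width; // tuple width, used to pad missing children with nulls
        final TreeMap<Integer, List<Object>> rows = new TreeMap<>();
        Rel(int width) { this.width = width; }
    }

    // Builds a single-column source from alternating key, value arguments.
    static Rel source(Object... keyValues) {
        Rel r = new Rel(1);
        for (int i = 0; i < keyValues.length; i += 2) {
            r.rows.put((Integer) keyValues[i], Collections.singletonList(keyValues[i + 1]));
        }
        return r;
    }

    // Shared loop: inner keeps keys present in every child, outer keeps all keys.
    private static Rel join(boolean inner, Rel... children) {
        int width = 0;
        TreeMap<Integer, Integer> hits = new TreeMap<>();
        for (Rel c : children) {
            width += c.width;
            for (Integer k : c.rows.keySet()) hits.merge(k, 1, Integer::sum);
        }
        Rel out = new Rel(width);
        for (Map.Entry<Integer, Integer> e : hits.entrySet()) {
            if (inner && e.getValue() < children.length) continue;
            List<Object> tuple = new ArrayList<>();
            for (Rel c : children) {
                List<Object> part = c.rows.get(e.getKey());
                // Pad with nulls where a child has no value for this key.
                tuple.addAll(part != null ? part : Collections.<Object>nCopies(c.width, null));
            }
            out.rows.put(e.getKey(), tuple);
        }
        return out;
    }

    static Rel inner(Rel... children) { return join(true, children); }
    static Rel outer(Rel... children) { return join(false, children); }

    // The rightmost child holding a key supplies its tuple, so tuples in the
    // result may differ in width; the width field is only a nominal value here.
    static Rel override(Rel... children) {
        Rel out = new Rel(children[children.length - 1].width);
        for (Rel c : children) out.rows.putAll(c.rows); // later children overwrite earlier ones
        return out;
    }

    public static void main(String[] args) {
        // Each source uses a different value type: Integer, String, Double.
        Rel a = source(1, 10, 2, 20, 4, 40);
        Rel b = source(2, "b2", 3, "b3", 4, "b4");
        Rel c = source(1, 1.5, 4, 4.5, 5, 5.5);
        System.out.println(outer(outer(a, b), c).rows.equals(outer(a, outer(b, c)).rows));
        System.out.println(inner(inner(a, b), c).rows.equals(inner(a, b, c).rows));
        System.out.println(override(inner(a, b), a).rows.equals(a.rows));
    }
}
```

In this model override(inner(A, B), A) mixes two-element tuples from the inner join with single values from A, which is the kind of mixed value typing that a per-value consistency check would reject.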