I'd like to see a modular design rather than intermingling different concepts.
In that case we can extend POLocalRearrange and add a POJoinLocalRearrange that handles join-specific conditions like this. That way nothing join-specific gets mixed into POLocalRearrange itself.
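A rough sketch of the subclass idea, only to illustrate the shape. The classes below are hypothetical stand-ins, not Pig's real POLocalRearrange API, and the join-specific check is assumed here to be "skip records whose join key is null":

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for POLocalRearrange: pulls out the key and pairs it with the record.
class LocalRearrangeSketch {
    protected final int keyIndex;

    LocalRearrangeSketch(int keyIndex) {
        this.keyIndex = keyIndex;
    }

    // Returns [key, record]. The generic version does no join-specific filtering.
    List<Object> rearrange(List<Object> record) {
        Object key = record.get(keyIndex);
        return Arrays.asList(key, record);
    }
}

// Hypothetical stand-in for the proposed POJoinLocalRearrange.
class JoinLocalRearrangeSketch extends LocalRearrangeSketch {
    JoinLocalRearrangeSketch(int keyIndex) {
        super(keyIndex);
    }

    // Returns null to signal that the record is dropped before the shuffle.
    @Override
    List<Object> rearrange(List<Object> record) {
        // Assumed join-specific condition: drop records with a null join key.
        if (record.get(keyIndex) == null) {
            return null;
        }
        return super.rearrange(record);
    }
}
```

The check lives in the join subclass only, so the generic rearrange stays untouched and no separate filter operator is added to the plan.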
I don't feel it is hard to find the join key in the logical optimizer and add a filter on it.
Adding an extra filter operator for three lines of checks will definitely hurt performance when we are dealing with billions of records. We recently had a user who added is-null bincond checks for a lot of columns in a foreach that processed 10+ billion records, and it took an extra 40+ minutes. Filter and foreach are exactly what we are trying to optimize in PIG-3764 with bytecode generation. Since we are trying to improve performance everywhere and save milliseconds, we should not add a separate operator here unless it is a major or complicated change, in which case keeping it separate would be cleaner.