I looked at the code change. I think it makes sense to move the logic of pushing join expressions into a separate rule, and it's up to each system to decide whether to turn such a rule on or off in its planner. The code change looks fine to me (one minor comment).
I'm a bit surprised, though, that pushing expressions into a project below the join caused a performance regression in Hive. I guess such a pushdown would add overhead in two scenarios:
1) The join condition itself does little or no filtering. In that case, it does not matter much whether the filter is applied in the join operator or in a filter operator after the join.
2) The join condition evaluation applies short-circuit evaluation, so it may be possible to skip some expensive expressions. In contrast, if we push the expressions down, we end up always evaluating every expression.
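To illustrate the second scenario, here is a minimal sketch in plain Java. It is not Calcite or Hive code; the class `ShortCircuitDemo`, the predicates `cheap` and `expensive`, and the sample rows are all hypothetical. It counts how often the expensive expression runs when the condition `cheap(a) AND expensive(b)` is evaluated inside the join versus when `expensive(b)` is pre-computed for every row in a projection below the join.

```java
public class ShortCircuitDemo {
    // Number of times the (hypothetical) expensive expression was evaluated.
    static int expensiveCalls = 0;

    // Stand-in for a cheap, selective conjunct of the join condition.
    static boolean cheap(int a) {
        return a > 0;
    }

    // Stand-in for an expensive conjunct; each call is counted.
    static boolean expensive(int b) {
        expensiveCalls++;
        return b % 2 == 0;
    }

    // Condition evaluated in the join: && skips expensive() when cheap() is false.
    static int evalInJoin(int[][] rows) {
        expensiveCalls = 0;
        int matches = 0;
        for (int[] r : rows) {
            if (cheap(r[0]) && expensive(r[1])) {
                matches++;
            }
        }
        return matches;
    }

    // Expression pushed into a projection: computed for every row up front,
    // then the join condition only reads the projected column.
    static int evalPushedDown(int[][] rows) {
        expensiveCalls = 0;
        boolean[] projected = new boolean[rows.length];
        for (int i = 0; i < rows.length; i++) {
            projected[i] = expensive(rows[i][1]);
        }
        int matches = 0;
        for (int i = 0; i < rows.length; i++) {
            if (cheap(rows[i][0]) && projected[i]) {
                matches++;
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        int[][] rows = {{-1, 2}, {-2, 4}, {3, 6}, {-4, 8}};
        int inJoin = evalInJoin(rows);
        System.out.println("in join:     matches=" + inJoin + ", expensive calls=" + expensiveCalls);
        int pushed = evalPushedDown(rows);
        System.out.println("pushed down: matches=" + pushed + ", expensive calls=" + expensiveCalls);
    }
}
```

Both variants return the same matches, but with these sample rows the in-join variant evaluates the expensive expression once while the pushed-down variant evaluates it four times.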
I guess such scenarios should probably be reflected in the costing; it's up to the costing to decide which way to go, while the rule's job is to enumerate the possible choices.
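As a rough illustration of "the rule enumerates, the costing decides", here is a toy sketch in plain Java. It does not use Calcite's cost model or any real planner API; the class `ToyJoinCost`, its formulas, and all parameters are assumptions made up for this example. It assumes the in-join variant evaluates the condition once per candidate row pair with short-circuiting, while the pushed-down variant evaluates the expensive expression once per input row.

```java
public class ToyJoinCost {
    // Hypothetical cost of evaluating the condition inside the join:
    // every candidate pair pays for the cheap conjunct, and only the
    // fraction that passes it pays for the expensive one.
    static double inJoinCost(double pairs, double cheapCost,
                             double expensiveCost, double passFraction) {
        return pairs * (cheapCost + passFraction * expensiveCost);
    }

    // Hypothetical cost after pushdown: the expensive expression is computed
    // once per input row in the projection, and every candidate pair still
    // pays for the cheap conjunct.
    static double pushedDownCost(double inputRows, double pairs,
                                 double cheapCost, double expensiveCost) {
        return inputRows * expensiveCost + pairs * cheapCost;
    }

    // The "planner" simply keeps the cheaper of the two enumerated alternatives.
    static String choose(double inputRows, double pairs, double cheapCost,
                         double expensiveCost, double passFraction) {
        double inJoin = inJoinCost(pairs, cheapCost, expensiveCost, passFraction);
        double pushed = pushedDownCost(inputRows, pairs, cheapCost, expensiveCost);
        return inJoin <= pushed ? "in-join" : "pushed-down";
    }

    public static void main(String[] args) {
        // Few candidate pairs, selective cheap conjunct.
        System.out.println(choose(1000, 1000, 1, 50, 0.1));
        // Many candidate pairs per input row, unselective cheap conjunct.
        System.out.println(choose(1000, 100000, 1, 50, 0.9));
    }
}
```

Under these made-up numbers the first call picks "in-join" and the second picks "pushed-down", which is the kind of decision a cost model could make once both alternatives are available.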
Also, if the query's join is ANSI SQL style, with the join condition in the "ON" clause, then Calcite will always do such a pushdown in SqlToRelConverter, before the optimization phases kick in.