In addition to (or perhaps instead of) the tests you added to RelOptRulesTest, can you add tests to RexProgramTest.testSimplify? These tests are more specific and easier to write and debug than RelOptRulesTest tests.
Can you review null semantics carefully? "a > 5" is not equivalent to "NOT(a <= 5)" when there are nulls involved. There's not an easy solution. Usually boolean expressions have an implicit "... IS TRUE" so it doesn't matter, but it's hard to be sure you're in that situation. NullSafeVisitor may help, by making null semantics explicit; after NullSafeVisitor has made a pass, a would be converted to CAST(a AS ... NOT NULL) and therefore you can see that nulls are not in play.
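To make the null hazard concrete, here is a minimal sketch of SQL's three-valued logic in plain Java. It does not use Calcite; the class and method names are hypothetical, and UNKNOWN is modelled as a null Boolean. It shows two things that break once nulls are in play: "p OR (complement of p)" is not always TRUE, and negation does not commute with the implicit "... IS TRUE".

```java
import java.util.Arrays;

final class ThreeValuedLogicSketch {
  // Assumption of this sketch: SQL UNKNOWN is represented by a null Boolean.
  static Boolean greaterThan(Integer a, int b) {
    return a == null ? null : Boolean.valueOf(a > b);
  }

  static Boolean lessThanOrEqual(Integer a, int b) {
    return a == null ? null : Boolean.valueOf(a <= b);
  }

  // NOT UNKNOWN is UNKNOWN.
  static Boolean not(Boolean x) {
    return x == null ? null : Boolean.valueOf(!x);
  }

  // TRUE OR anything is TRUE; otherwise UNKNOWN wins over FALSE.
  static Boolean or(Boolean x, Boolean y) {
    if (Boolean.TRUE.equals(x) || Boolean.TRUE.equals(y)) {
      return Boolean.TRUE;
    }
    if (x == null || y == null) {
      return null;
    }
    return Boolean.FALSE;
  }

  // The implicit "... IS TRUE" of a WHERE clause: UNKNOWN is treated as not true.
  static boolean isTrue(Boolean x) {
    return Boolean.TRUE.equals(x);
  }

  public static void main(String[] args) {
    for (Integer a : Arrays.asList(3, 7, null)) {
      Boolean p = greaterThan(a, 5);
      Boolean q = lessThanOrEqual(a, 5);
      System.out.println("a=" + a
          + "  p OR q=" + or(p, q)
          + "  (NOT p) IS TRUE=" + isTrue(not(p))
          + "  p IS NOT TRUE=" + !isTrue(p));
    }
  }
}
```

For a null input, the last two columns disagree, which is why a rewrite that is safe under an implicit IS TRUE may not be safe elsewhere.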
Do the HashSets in simplifyAnd2 make the behavior non-deterministic? (How ironic.) For instance, you iterate over notNullOperands, and HashSet does not guarantee an iteration order. LinkedHashSet would fix that, but also consider using the humble ArrayList (it's cheap). I don't know whether you expect to see term lists large enough that an O(n) contains would be a problem.
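As a rough illustration of the two alternatives (not the actual simplifyAnd2 code; the class and method names are hypothetical and the terms are plain strings), both of the following give an iteration order that depends only on insertion order:

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class DeterministicTermsSketch {
  // LinkedHashSet removes duplicates and iterates in insertion order.
  static List<String> dedupeWithLinkedHashSet(List<String> terms) {
    Set<String> seen = new LinkedHashSet<>(terms);
    return new ArrayList<>(seen);
  }

  // Plain ArrayList: contains() is O(n), which is fine for short term lists.
  static List<String> dedupeWithArrayList(List<String> terms) {
    List<String> result = new ArrayList<>();
    for (String term : terms) {
      if (!result.contains(term)) {
        result.add(term);
      }
    }
    return result;
  }
}
```

Either way the output order no longer depends on hash codes, so plans and test output stay stable from run to run.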
Since RexNode doesn't override equals and hashCode, maybe you should be building sets of strings.
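Here is a small sketch of why that matters. Node is a hypothetical stand-in for an expression class that inherits Object's identity-based equals and hashCode; it is not Calcite's RexNode. Two structurally identical nodes count as distinct in a HashSet of nodes, but collapse to one entry in a set of their string forms.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class IdentityEqualitySketch {
  /** Hypothetical node type; deliberately does NOT override equals/hashCode. */
  static final class Node {
    final String op;
    final String operand;

    Node(String op, String operand) {
      this.op = op;
      this.operand = operand;
    }

    @Override public String toString() {
      return op + "(" + operand + ")";
    }
  }

  // Counts distinct nodes using the inherited identity-based equals/hashCode.
  static int distinctByIdentity(Node... nodes) {
    Set<Node> set = new HashSet<>(Arrays.asList(nodes));
    return set.size();
  }

  // Counts distinct nodes by their string form instead.
  static int distinctByString(Node... nodes) {
    Set<String> set = new HashSet<>();
    for (Node node : nodes) {
      set.add(node.toString());
    }
    return set.size();
  }
}
```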
Your RexUtil.negate(RexCall) method does the same thing as NullSafeVisitor.negate(RexCall call) that I just added. Can you please combine them? (Your negate method returns null in theory but not in practice. It should stop being wimpy and just throw.)
Similarly, your RexUtil.invert method should leverage SqlKind.reverse or RexImplicationChecker.InputUsageFinder.reverse.
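To spell out the difference between negating and reversing a comparison, and what "just throw" might look like, here is a self-contained sketch. The Cmp enum and both methods are hypothetical and only mirror the idea; they are not the SqlKind or RexUtil code.

```java
final class ComparisonKindSketch {
  /** Hypothetical operator kinds; LIKE stands in for a non-comparison operator. */
  enum Cmp { LT, LE, GT, GE, EQ, NE, LIKE }

  // Logical negation: NOT (a > b) becomes a <= b (ignoring nulls).
  static Cmp negate(Cmp kind) {
    switch (kind) {
    case LT: return Cmp.GE;
    case LE: return Cmp.GT;
    case GT: return Cmp.LE;
    case GE: return Cmp.LT;
    case EQ: return Cmp.NE;
    case NE: return Cmp.EQ;
    default:
      // Fail loudly instead of returning null that callers must check.
      throw new IllegalArgumentException("cannot negate " + kind);
    }
  }

  // Operand swap: a > b becomes b < a. The truth value is unchanged.
  static Cmp reverse(Cmp kind) {
    switch (kind) {
    case LT: return Cmp.GT;
    case LE: return Cmp.GE;
    case GT: return Cmp.LT;
    case GE: return Cmp.LE;
    case EQ: return Cmp.EQ;
    case NE: return Cmp.NE;
    default:
      throw new IllegalArgumentException("cannot reverse " + kind);
    }
  }
}
```

Both mappings are their own inverse, which makes a cheap sanity check: applying negate twice, or reverse twice, gives back the original kind.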
On a couple of occasions you write termsSet.containsAll(new HashSet(collection)) where you could write termsSet.containsAll(collection).
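A quick sketch of why the wrapper is unnecessary: Set.containsAll accepts any Collection and ignores duplicates and order in its argument, so copying into a HashSet first does not change the answer. The class and method names below are hypothetical.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

final class ContainsAllSketch {
  // What the patch does: copy the argument into a set first.
  static boolean containsAllWrapped(Set<String> termsSet, Collection<String> collection) {
    return termsSet.containsAll(new HashSet<>(collection));
  }

  // Equivalent and simpler: pass the collection directly.
  static boolean containsAllDirect(Set<String> termsSet, Collection<String> collection) {
    return termsSet.containsAll(collection);
  }
}
```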