Julian Hyde, +1 to the idea of adding a maximum row count; I think we had even discussed that previously.
I have just checked the latest patch by Pengcheng Xiong. The patch derives the maximum row count from the estimated row count. I think we should instead provide a separate metadata provider and method for it.
In some cases, the estimated row count might be equal to the max row count, but not in all cases. For instance, consider a join operator with two inputs that produce n and m tuples, respectively. If no PK/FK relationship is present, we rely on estimated selectivity to calculate the estimated row count, but the maximum row count should not be the result of applying Math.ceil to the estimated row count. Rather, the max row count should be n * m, since in the worst case every tuple from one input matches every tuple from the other.
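To make the difference concrete, here is a minimal sketch of the two quantities for a join. The class and method names are hypothetical and are not part of Calcite or of the patch; the selectivity value is just an assumed input.

```java
/** Hypothetical helper, not Calcite API: contrasts the estimate with the upper bound for a join. */
public final class JoinRowCounts {
    private JoinRowCounts() {}

    /** Estimated row count: cross-product size scaled by an assumed selectivity. */
    public static double estimatedRowCount(double leftRows, double rightRows, double selectivity) {
        return leftRows * rightRows * selectivity;
    }

    /** Max row count: in the worst case every left row matches every right row. */
    public static double maxRowCount(double leftMax, double rightMax) {
        return leftMax * rightMax;
    }
}
```

With n = 100, m = 50 and an assumed selectivity of 0.25, the estimate is 1250 rows while the bound is 5000 rows, so taking Math.ceil of the estimate would understate the maximum.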
We do not need to implement maximum row count for every operator in this patch; we can delegate the implementation of the method to the different systems using Calcite. But at the very least, the patch should provide the max row count implementation for the Sort operator (which might be equal to the estimated row count in this case), so that it works as expected with the LimitJoinTranspose and LimitUnionTranspose rules.
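As a rough sketch of what the Sort case could look like: the bound is the input's max row count after skipping the offset, capped by the fetch. The names below are hypothetical, and treating a negative fetch as "no limit" and infinity as "unknown input bound" are assumptions made for this illustration only, not conventions taken from the patch.

```java
/** Hypothetical helper, not Calcite API: upper bound on rows produced by a sort with offset/fetch. */
public final class SortRowCounts {
    private SortRowCounts() {}

    /**
     * @param inputMax max row count of the input (Double.POSITIVE_INFINITY if unknown; assumption)
     * @param offset   number of rows skipped
     * @param fetch    row limit, or a negative value for "no limit" (assumption)
     */
    public static double maxRowCount(double inputMax, long offset, long fetch) {
        // Rows remaining once the offset has been skipped; never negative.
        double afterOffset = Math.max(inputMax - offset, 0d);
        // Without a limit the bound is whatever remains; otherwise the limit caps it.
        return fetch < 0 ? afterOffset : Math.min(afterOffset, fetch);
    }
}
```

A limit-pushing rule could then compare an input's max row count against the limit and skip the transformation when the input is already guaranteed to produce no more rows than the limit.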