Currently `ShuffledHashJoin.outputPartitioning` inherits from `HashJoin.outputPartitioning`, which only preserves stream side partitioning:
This loses the build side partitioning information and causes an extra shuffle if there is another join or group-by after this join.
Current physical plan (having an extra shuffle on `k1` before aggregate)
Ideal physical plan (no shuffle on `k1` before aggregate)
This can be fixed by overriding the `outputPartitioning` method in `ShuffledHashJoinExec`, similar to what `SortMergeJoinExec` does.
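To illustrate the idea, here is a minimal, self-contained sketch. It does not use Spark's real classes: `Partitioning`, `HashPartitioning`, `PartitioningCollection`, `UnknownPartitioning` and the join types below are simplified stand-ins that only borrow the names, and `streamSideOnly`, `bothSides` and `satisfiesClustering` are hypothetical helpers. The `bothSides` match is modeled on the per-join-type pattern in `SortMergeJoinExec`; the actual change may differ in detail.

```scala
// Simplified stand-ins for Spark's partitioning classes (NOT the real API).
sealed trait Partitioning { def numPartitions: Int }
case class HashPartitioning(keys: Seq[String], numPartitions: Int) extends Partitioning
case class UnknownPartitioning(numPartitions: Int) extends Partitioning
case class PartitioningCollection(partitionings: Seq[Partitioning]) extends Partitioning {
  // All members are assumed to have the same number of partitions.
  def numPartitions: Int = partitionings.head.numPartitions
}

// Simplified stand-ins for join types.
sealed trait JoinType
case object Inner extends JoinType
case object LeftOuter extends JoinType
case object RightOuter extends JoinType
case object FullOuter extends JoinType

object JoinPartitioning {
  // Behavior described above: only the stream side partitioning survives.
  def streamSideOnly(stream: Partitioning): Partitioning = stream

  // Sketch of the proposed behavior: report both sides where that is safe.
  def bothSides(joinType: JoinType, left: Partitioning, right: Partitioning): Partitioning =
    joinType match {
      case Inner      => PartitioningCollection(Seq(left, right))
      case LeftOuter  => left
      case RightOuter => right
      case FullOuter  => UnknownPartitioning(left.numPartitions)
    }

  // Hypothetical check: is the data already clustered on the grouping keys?
  // If not, the planner would have to insert a shuffle.
  def satisfiesClustering(p: Partitioning, groupingKeys: Seq[String]): Boolean = p match {
    case HashPartitioning(keys, _)  => keys.nonEmpty && keys.forall(groupingKeys.contains)
    case PartitioningCollection(ps) => ps.exists(satisfiesClustering(_, groupingKeys))
    case UnknownPartitioning(_)     => false
  }
}
```

For example, assume an inner join on `k1 = k2` where the left (build) side is hash-partitioned on `k1` and the right (stream) side on `k2`, followed by a group-by on `k1`. In this model, `satisfiesClustering(streamSideOnly(right), Seq("k1"))` is false, so a shuffle is needed, while `satisfiesClustering(bothSides(Inner, left, right), Seq("k1"))` is true, so the aggregate can reuse the existing partitioning.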