For an OVER clause, we can have partitioning columns (specified by PARTITION BY) and ordering columns (specified by ORDER BY). In the current implementation, we use the key columns of the ReduceSinkOperator (RS) to take care of both grouping (for the partitioning columns) and ordering (for the ordering columns). So, we first add all partitioning columns and then add all ordering columns to the key columns of the RS. If ordering columns are not specified, we use the partitioning columns as the ordering columns, so every partitioning column appears twice in the key columns. It seems we cannot completely remove those duplicate key columns right now (because the key columns of the RS need to take care of both grouping and ordering). But we can optimize certain cases. For example, if ordering columns are not specified, we do not need to assign the partitioning columns to the ordering columns.
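
To make the duplication concrete, here is a minimal sketch of how the RS key columns are assembled as described above, next to the proposed optimization. It is a toy model in Python, not the actual Hive code: the function names `current_key_columns` and `optimized_key_columns` and the column names are hypothetical and used only for illustration.

```python
from typing import List, Optional


def current_key_columns(partition_cols: List[str],
                        order_cols: Optional[List[str]] = None) -> List[str]:
    """Toy model of the behavior described above (hypothetical helper).

    Partitioning columns are added first, then ordering columns. When no
    ORDER BY is given, the partitioning columns are reused as ordering
    columns, so each of them ends up in the key list twice.
    """
    if not order_cols:
        order_cols = list(partition_cols)
    return list(partition_cols) + list(order_cols)


def optimized_key_columns(partition_cols: List[str],
                          order_cols: Optional[List[str]] = None) -> List[str]:
    """Toy model of the proposed optimization (hypothetical helper).

    When no ORDER BY is given, the fallback is skipped and the key list
    holds only the partitioning columns.
    """
    return list(partition_cols) + list(order_cols or [])


# OVER (PARTITION BY a, b) with no ORDER BY
print(current_key_columns(["a", "b"]))
print(optimized_key_columns(["a", "b"]))

# OVER (PARTITION BY a, b ORDER BY c): both variants agree
print(current_key_columns(["a", "b"], ["c"]))
print(optimized_key_columns(["a", "b"], ["c"]))
```

The sketch covers only the case where ORDER BY is absent. When ordering columns are specified and overlap with the partitioning columns, both variants still keep the repeated columns, which matches the note that the duplicates cannot be removed completely right now.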