Details
- Improvement
- Status: Closed
- Major
- Resolution: Fixed
- None
Description
Currently, the Deduplicate operator supports only first-row deduplication (ordered by proc-time). For first-n-rows deduplication, the planner has to fall back to the Rank operator, which is less efficient than Deduplicate because it keeps larger state and accesses state more often.
This issue proposes extending DeduplicateKeepFirstRowFunction to support first-n-rows deduplication. The original first-row deduplication then becomes the special case n = 1 of first-n-rows deduplication.
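The idea can be sketched in plain Java, outside of Flink. This is only an illustration under stated assumptions, not the actual DeduplicateKeepFirstRowFunction: the class name `FirstNRowsDeduplicator`, its `accept` method, and the use of a `HashMap` in place of keyed state are all hypothetical. It assumes rows arrive in proc-time order, so a per-key counter is enough to decide whether a row is among the first n for its key.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; this is not Flink's DeduplicateKeepFirstRowFunction.
final class FirstNRowsDeduplicator<K> {
    private final int n;
    // Per-key state: how many rows were already emitted for the key.
    // Only a counter is kept, not the rows themselves.
    private final Map<K, Integer> emittedCount = new HashMap<>();

    FirstNRowsDeduplicator(int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be >= 1");
        }
        this.n = n;
    }

    // Returns true if the row is among the first n rows seen for this key
    // (in arrival order) and should therefore be emitted.
    boolean accept(K key) {
        int seen = emittedCount.getOrDefault(key, 0);
        if (seen >= n) {
            return false; // the key already has n rows; drop this one
        }
        emittedCount.put(key, seen + 1);
        return true;
    }

    public static void main(String[] args) {
        FirstNRowsDeduplicator<String> dedup = new FirstNRowsDeduplicator<>(2);
        for (String key : new String[] {"a", "a", "b", "a"}) {
            System.out.println(key + " -> " + dedup.accept(key));
        }
    }
}
```

With n = 2, the third row for key "a" is dropped while the first two are kept. Constructing the class with n = 1 gives the existing first-row behaviour, which is why first-row deduplication can be treated as a special case.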
Attachments