Details
- Type: New Feature
- Status: Reopened
- Priority: Not a Priority
- Resolution: Unresolved
- 0.7.0-incubating
- None
Description
A common operation is to compare or combine all elements of a group pairwise (there were two questions about this on the user mailing list recently). Right now, this can be done in two ways:
1. groupReduce: consume the complete iterator, store it in memory, and build all pairs.
2. Self-join: the engine builds all pairs of the full symmetric Cartesian product.
Both approaches have drawbacks. The groupReduce variant requires the full group to fit into memory and is more cumbersome for the user to implement, but it allows pairs to be built arbitrarily. The self-join approach pushes most of the work into the system, but the execution strategy does not treat a self-join differently from a regular join (for example, both identical inputs are shuffled), and it always builds the full symmetric Cartesian product.
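To make the difference between the two variants concrete, here is a minimal plain-Java sketch of the pairing logic only. It does not use the Flink API. The class `PairwiseGroupSketch`, the `Pair` helper, and both method names are hypothetical and exist only for illustration. `uniquePairs` mirrors what a user would typically write inside a groupReduce function (buffer the group, then pair freely), while `fullProduct` mirrors the result shape of a self-join on the group key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration class; not part of Flink.
final class PairwiseGroupSketch {

    /** Simple immutable pair; a stand-in for a tuple type. */
    static final class Pair<T> {
        final T left;
        final T right;

        Pair(T left, T right) {
            this.left = left;
            this.right = right;
        }

        @Override
        public String toString() {
            return "(" + left + ", " + right + ")";
        }
    }

    /**
     * groupReduce-style: the group arrives as a one-pass iterator, so it is
     * first buffered completely in memory. After that, pairs can be built
     * arbitrarily; here each unordered pair is emitted exactly once.
     */
    static <T> List<Pair<T>> uniquePairs(Iterator<T> group) {
        List<T> buffered = new ArrayList<>();
        while (group.hasNext()) {
            buffered.add(group.next());
        }
        List<Pair<T>> out = new ArrayList<>();
        for (int i = 0; i < buffered.size(); i++) {
            for (int j = i + 1; j < buffered.size(); j++) {
                out.add(new Pair<>(buffered.get(i), buffered.get(j)));
            }
        }
        return out;
    }

    /**
     * Self-join-style: every element is paired with every element of the
     * same group, including itself and both orderings, which yields the
     * full symmetric Cartesian product.
     */
    static <T> List<Pair<T>> fullProduct(List<T> group) {
        List<Pair<T>> out = new ArrayList<>();
        for (T a : group) {
            for (T b : group) {
                out.add(new Pair<>(a, b));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> group = Arrays.asList("a", "b", "c");
        System.out.println(uniquePairs(group.iterator()));
        System.out.println(fullProduct(group));
    }
}
```

For a group of n elements, the first variant emits n(n-1)/2 pairs but must hold all n elements in memory, while the second always emits n² pairs.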
I propose adding a dedicated crossGroup() operator that offers this functionality without the drawbacks above.
Issue Links
- is duplicated by FLINK-3910 New self-join operator (Closed)