Details
- New Feature
- Status: Closed
- Major
- Resolution: Duplicate
- 1.1.0
- None
- None
Description
Flink currently provides inner and outer joins, as well as cogroup and the non-keyed cross. JoinOperator hints at future support for semi- and anti-joins.
Many Gelly algorithms perform a self-join [0]. Two changes that are still pending review do so: FLINK-3768 performs a self-join on non-skewed data in TriangleListing.java, and FLINK-3780 performs a self-join on skewed data in JaccardSimilarity.java. A SelfJoinHint would select between the skewed and non-skewed implementations.
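To make the operation concrete, here is a minimal sketch in plain Java (no Flink API) of what a keyed self-join computes: group the input by key, then emit every unordered pair of values within each group. The class name SelfJoinSketch, the int[] {key, value} encoding, and the selfJoin method are all hypothetical and chosen for illustration; this is not the implementation in FLINK-3768 or FLINK-3780, and it does not model the skewed/non-skewed choice.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SelfJoinSketch {

    /**
     * Hypothetical helper: each input element is {key, value}.
     * Returns one {key, left, right} triple per unordered pair of values sharing a key.
     */
    public static List<int[]> selfJoin(List<int[]> input) {
        // Group values by key, preserving first-seen key order.
        Map<Integer, List<Integer>> groups = new LinkedHashMap<>();
        for (int[] element : input) {
            groups.computeIfAbsent(element[0], k -> new ArrayList<>()).add(element[1]);
        }

        // Within each group, pair every value with every later value.
        List<int[]> output = new ArrayList<>();
        for (Map.Entry<Integer, List<Integer>> group : groups.entrySet()) {
            List<Integer> values = group.getValue();
            for (int i = 0; i < values.size(); i++) {
                for (int j = i + 1; j < values.size(); j++) {
                    output.add(new int[] {group.getKey(), values.get(i), values.get(j)});
                }
            }
        }
        return output;
    }

    public static void main(String[] args) {
        // Edges keyed by source vertex: vertex 1 has neighbors 2, 3, 4; vertex 2 has neighbor 3.
        List<int[]> edges = Arrays.asList(
                new int[] {1, 2}, new int[] {1, 3}, new int[] {1, 4}, new int[] {2, 3});
        for (int[] triple : selfJoin(edges)) {
            System.out.println(triple[0] + ": (" + triple[1] + ", " + triple[2] + ")");
        }
    }
}
```

A group of n elements yields n(n-1)/2 pairs, which is why a few very large groups (skewed data) can dominate the runtime and may justify a separate implementation.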
The object-reuse-disabled case can be handled simply with a new operator. The object-reuse-enabled case requires either CopyableValue types (as in the code above) or a custom driver that has access to the serializer (or making the serializer accessible to rich functions, and I think there be dragons).
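The difficulty with object reuse can be illustrated without Flink. A self-join has to buffer the elements of a group in order to pair them, but an iterator that reuses one mutable instance overwrites that instance on every call, so buffered references all end up pointing at the last element unless each one is copied. The sketch below assumes a hypothetical MutableLong type with a copy() method, standing in for a copyable value type; the names reusingIterator and buffer are likewise illustrative only.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class ObjectReuseSketch {

    /** Hypothetical mutable value type that knows how to copy itself. */
    public static final class MutableLong {
        public long value;

        public MutableLong copy() {
            MutableLong c = new MutableLong();
            c.value = value;
            return c;
        }
    }

    /** Returns an iterator that refills and hands out the same instance on every call. */
    public static Iterator<MutableLong> reusingIterator(final long[] data) {
        return new Iterator<MutableLong>() {
            private final MutableLong reused = new MutableLong();
            private int pos = 0;

            @Override
            public boolean hasNext() {
                return pos < data.length;
            }

            @Override
            public MutableLong next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                reused.value = data[pos++];
                return reused;
            }
        };
    }

    /** Buffers all elements, optionally copying each one before storing it. */
    public static List<MutableLong> buffer(Iterator<MutableLong> it, boolean copy) {
        List<MutableLong> buffered = new ArrayList<>();
        while (it.hasNext()) {
            MutableLong v = it.next();
            buffered.add(copy ? v.copy() : v);
        }
        return buffered;
    }

    public static void main(String[] args) {
        long[] data = {1, 2, 3};
        // Without copying, every buffered reference is the same object holding the last value.
        for (MutableLong v : buffer(reusingIterator(data), false)) {
            System.out.print(v.value + " ");
        }
        System.out.println();
        // With copying, the buffered values are preserved.
        for (MutableLong v : buffer(reusingIterator(data), true)) {
            System.out.print(v.value + " ");
        }
        System.out.println();
    }
}
```

When the element type cannot copy itself, the copy would have to come from somewhere else, which is where access to the serializer enters the picture.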
If the idea of a self-join is agreeable, I'd like to work out a rough implementation and go from there.
Attachments
Issue Links
- duplicates FLINK-1267 Add crossGroup operator (Reopened)