Details
- Type: Improvement
- Status: Open
- Priority: Major
- Resolution: Unresolved
- 1.26.0
- None
- None
Description
Currently, EnumerableDefaults#union buffers all the rows before it returns the first of them.
Pros:
1) Faster iteration in case enumerable is queried multiple times
Cons:
1) The implementation does not work with infinite streams
2) Keeps the buffered rows in memory even after iteration is finished
—
An alternative might be something like:
public static <TSource> Enumerable<TSource> union(
    Enumerable<TSource> source0, Enumerable<TSource> source1) {
  Enumerable<TSource> unionAll = concat(source0, source1);
  return new AbstractEnumerable<TSource>() {
    @Override public Enumerator<TSource> enumerator() {
      Set<TSource> set = new HashSet<>();
      return EnumerableDefaults.where(unionAll, set::add).enumerator();
    }
  };
}
Pros:
1) Supports infinite streams
2) In theory, it could reset hashSet after iteration finishes
Cons:
1) Slower iteration in case enumerable is queried multiple times (hashSet is rebuilt every time)
2) concat + AbstractEnumerable might cost CPU cycles
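To make the trade-off concrete, here is a minimal self-contained sketch of the same lazy idea. It deliberately uses plain java.util.Iterable/Iterator instead of linq4j's Enumerable/Enumerator, so it is an illustration and not the Calcite code; the class name LazyUnionSketch and the helper naturals() are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Set;

// Hypothetical illustration only: not part of Calcite or linq4j.
final class LazyUnionSketch {
  /** Lazily yields the distinct elements of source0, then those of source1. */
  static <T> Iterable<T> union(Iterable<T> source0, Iterable<T> source1) {
    return () -> new Iterator<T>() {
      // A fresh set per traversal: rebuilt on every iteration, and
      // eligible for garbage collection once the iterator is dropped.
      private final Set<T> seen = new HashSet<>();
      private Iterator<T> current = source0.iterator();
      private boolean onSecond = false;
      private T pending;
      private boolean ready = false;

      @Override public boolean hasNext() {
        while (!ready) {
          if (current.hasNext()) {
            T candidate = current.next();
            // Set#add returns false for duplicates, so it doubles as the filter.
            if (seen.add(candidate)) {
              pending = candidate;
              ready = true;
            }
          } else if (!onSecond) {
            // The second source is only opened once the first is exhausted.
            current = source1.iterator();
            onSecond = true;
          } else {
            return false;
          }
        }
        return true;
      }

      @Override public T next() {
        if (!hasNext()) {
          throw new NoSuchElementException();
        }
        ready = false;
        T result = pending;
        pending = null;
        return result;
      }
    };
  }

  /** An unbounded source 0, 1, 2, ... used to show that nothing is buffered up front. */
  static Iterable<Integer> naturals() {
    return () -> new Iterator<Integer>() {
      private int i = 0;
      @Override public boolean hasNext() { return true; }
      @Override public Integer next() { return i++; }
    };
  }

  public static void main(String[] args) {
    Iterator<Integer> it = union(Arrays.asList(3, 1, 3), naturals()).iterator();
    List<Integer> firstFive = new ArrayList<>();
    for (int i = 0; i < 5; i++) {
      firstFive.add(it.next());
    }
    System.out.println(firstFive); // prints [3, 1, 0, 2, 4]
  }
}
```

Two caveats apply to this sketch: the set of seen elements still grows with the number of distinct rows consumed, and hasNext() can spin indefinitely if an unbounded source stops producing new values.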
Issue Links
- is related to CALCITE-3221 Add MergeUnion operator in Enumerable convention (Closed)