Calcite / CALCITE-4480

Make EnumerableDefaults#union a non-blocking operation

Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.26.0
    • Fix Version/s: None
    • Component/s: core
    • Labels: None

    Description

      Currently, EnumerableDefaults#union buffers all the rows before it returns the first of them.

      Pros:
      1) Faster iteration when the enumerable is iterated multiple times

      Cons:
      1) The implementation does not work with infinite streams
      2) Retains the buffered rows in memory even after iteration has finished
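
      The buffering behavior described above can be illustrated with a small stand-alone sketch. It uses plain java.util types instead of linq4j's Enumerable, and the names EagerUnionSketch, counting, eagerUnion and pulled are hypothetical helpers for this illustration only, not Calcite code. It shows that an eager union has already consumed every input row before the caller sees the first result, which is why it cannot terminate on an infinite input.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class EagerUnionSketch {
  /** Number of source rows consumed so far; for demonstration only. */
  static int pulled = 0;

  /** Wraps a list so that every row handed out is counted in {@code pulled}. */
  static <T> Iterable<T> counting(List<T> rows) {
    return () -> {
      Iterator<T> it = rows.iterator();
      return new Iterator<T>() {
        @Override public boolean hasNext() { return it.hasNext(); }
        @Override public T next() { pulled++; return it.next(); }
      };
    };
  }

  /** Eager union: drains both inputs into a set, then returns the set. */
  static <T> Iterable<T> eagerUnion(Iterable<T> a, Iterable<T> b) {
    Set<T> buffer = new LinkedHashSet<>();
    for (T row : a) buffer.add(row);
    for (T row : b) buffer.add(row);
    return buffer; // the buffer stays reachable as long as the result does
  }

  public static void main(String[] args) {
    Iterable<Integer> u = eagerUnion(
        counting(Arrays.asList(1, 2, 2)), counting(Arrays.asList(2, 3)));
    // All five input rows were read before any result row is requested.
    System.out.println("rows pulled before first result: " + pulled);
    System.out.println("result: " + u);
  }
}
```

      Re-iterating the returned set is cheap because deduplication happened once, which matches the "Pros" item; the cost is that the whole buffer lives as long as the result.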

      An alternative might be something like

        public static <TSource> Enumerable<TSource> union(Enumerable<TSource> source0,
            Enumerable<TSource> source1) {
          // Chain both inputs into one sequence.
          Enumerable<TSource> unionAll = concat(source0, source1);
          return new AbstractEnumerable<TSource>() {
            @Override public Enumerator<TSource> enumerator() {
              // A fresh set per enumerator: Set#add returns false for duplicates,
              // so it doubles as the filter predicate.
              Set<TSource> set = new HashSet<>();
              return EnumerableDefaults.where(unionAll, set::add).enumerator();
            }
          };
        }
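
      To try the idea without a Calcite dependency, here is a minimal sketch of the same technique on plain java.util.Iterator. The names LazyUnionSketch, lazyUnion and take are hypothetical, and Iterable stands in for Enumerable as an assumption of this sketch. Rows are deduplicated one at a time as they are pulled, so the first result is available after reading a single input row.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Set;
import java.util.stream.Stream;

class LazyUnionSketch {
  /** Lazy union: each call to iterator() starts with a fresh "seen" set. */
  static <T> Iterable<T> lazyUnion(Iterable<T> a, Iterable<T> b) {
    return () -> new Iterator<T>() {
      private final Set<T> seen = new HashSet<>();
      private final Iterator<T> first = a.iterator();
      private Iterator<T> second; // opened only once the first input is exhausted
      private T pending;
      private boolean ready;

      @Override public boolean hasNext() {
        while (!ready) {
          Iterator<T> current;
          if (first.hasNext()) {
            current = first;
          } else {
            if (second == null) second = b.iterator();
            if (!second.hasNext()) return false;
            current = second;
          }
          T candidate = current.next();
          // Set#add returns true only for rows not seen before.
          if (seen.add(candidate)) {
            pending = candidate;
            ready = true;
          }
        }
        return true;
      }

      @Override public T next() {
        if (!hasNext()) throw new NoSuchElementException();
        ready = false;
        return pending;
      }
    };
  }

  /** Collects at most n rows without asking the source for more than needed. */
  static <T> List<T> take(Iterable<T> source, int n) {
    List<T> out = new ArrayList<>();
    Iterator<T> it = source.iterator();
    while (out.size() < n && it.hasNext()) out.add(it.next());
    return out;
  }

  public static void main(String[] args) {
    // An infinite input: 0, 1, 2, ...
    Iterable<Integer> naturals = () -> Stream.iterate(0, i -> i + 1).iterator();
    System.out.println(take(lazyUnion(naturals, Arrays.asList(-1)), 3)); // prints [0, 1, 2]
  }
}
```

      Two caveats of this sketch: the "seen" set still grows with the number of distinct rows, and hasNext() never returns if an infinite input stops producing new distinct values.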
      

      Pros:
      1) Supports infinite streams
      2) In theory, it could clear the HashSet once iteration finishes

      Cons:
      1) Slower iteration when the enumerable is iterated multiple times (the HashSet is rebuilt every time)
      2) concat + AbstractEnumerable might cost CPU cycles
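
      The trade-off in the two lists above can be made concrete with a rough sketch, again on plain java.util types rather than linq4j. DistinctIterator, retainedKeys and hashInserts are hypothetical names used only here. The iterator clears its set once the input is exhausted, and a counter shows that each new pass repeats all of the hash insertions.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Set;

class RebuildCostSketch {
  /** Counts calls to Set#add across all passes; for demonstration only. */
  static int hashInserts = 0;

  /** Deduplicating iterator over rows that are assumed to be already concatenated. */
  static final class DistinctIterator<T> implements Iterator<T> {
    private final Iterator<T> rows;
    private final Set<T> seen = new HashSet<>();
    private T pending;
    private boolean ready;

    DistinctIterator(Iterator<T> rows) {
      this.rows = rows;
    }

    @Override public boolean hasNext() {
      while (!ready && rows.hasNext()) {
        T candidate = rows.next();
        hashInserts++;
        if (seen.add(candidate)) {
          pending = candidate;
          ready = true;
        }
      }
      if (!ready) {
        seen.clear(); // input exhausted: drop the references to the rows
      }
      return ready;
    }

    @Override public T next() {
      if (!hasNext()) throw new NoSuchElementException();
      ready = false;
      T result = pending;
      pending = null;
      return result;
    }

    /** Number of rows the set still references. */
    int retainedKeys() {
      return seen.size();
    }
  }

  public static void main(String[] args) {
    List<Integer> rows = Arrays.asList(1, 2, 2, 3);
    for (int pass = 1; pass <= 2; pass++) {
      DistinctIterator<Integer> it = new DistinctIterator<>(rows.iterator());
      while (it.hasNext()) it.next();
      System.out.println("pass " + pass + ": inserts so far = " + hashInserts
          + ", retained keys = " + it.retainedKeys());
    }
  }
}
```

      With four input rows, the counter reaches 4 after the first pass and 8 after the second, while the retained key count drops to 0 at the end of each pass.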

            People

              Assignee: Unassigned
              Reporter: Vladimir Sitnikov