  1. Calcite
  2. CALCITE-2909

Optimize Enumerable SemiJoin with lazy computation of innerLookup

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.18.0, 1.19.0
    • Fix Version/s: 1.20.0
    • Component/s: None

      Description

      The implementation of semiJoin in EnumerableDefaults.java is based on two elements: an outer enumerator and an inner lookup. The method "returns elements of outer for which there is a member of inner with a matching key".
      To do this, the innerLookup is always computed eagerly, even though in some cases it is not needed at all: when the outer enumerator returns no elements, the innerLookup is never consulted. In the worst case, a time-consuming innerLookup computation combined with an empty outer enumerator leads to inefficient execution that could easily have been avoided.
      To avoid this, the proposal is to defer the computation of the innerLookup until it is actually needed, i.e. when the first outer enumerator item is processed.
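
      The idea can be illustrated with a minimal, self-contained sketch. It does not use Calcite's Enumerable or Lookup types and is not the code of the actual fix: the class LazySemiJoinSketch, its semiJoin signature, and the use of a plain HashSet of keys in place of the innerLookup are all assumptions made for illustration. The inner side is passed as a Supplier so that the sketch can count how often it is scanned.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Set;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical illustration only; not Calcite's EnumerableDefaults.
class LazySemiJoinSketch {

  /** Returns the elements of outer for which inner has an element with a matching key. */
  public static <O, I, K> Iterable<O> semiJoin(
      Iterable<O> outer,
      Supplier<? extends Iterable<I>> inner,
      Function<O, K> outerKey,
      Function<I, K> innerKey) {
    return () -> new Iterator<O>() {
      private final Iterator<O> outerIt = outer.iterator();
      private Set<K> innerKeys;   // stand-in for innerLookup; null until first needed
      private O buffered;         // next matching outer element, if ready is true
      private boolean ready;

      @Override public boolean hasNext() {
        while (!ready && outerIt.hasNext()) {
          O candidate = outerIt.next();
          if (innerKeys == null) {
            // Deferred until the first outer item is processed.
            innerKeys = new HashSet<>();
            for (I i : inner.get()) {
              innerKeys.add(innerKey.apply(i));
            }
          }
          if (innerKeys.contains(outerKey.apply(candidate))) {
            buffered = candidate;
            ready = true;
          }
        }
        return ready;
      }

      @Override public O next() {
        if (!hasNext()) {
          throw new NoSuchElementException();
        }
        ready = false;
        return buffered;
      }
    };
  }

  public static void main(String[] args) {
    int[] innerScans = {0};
    Supplier<List<Integer>> inner = () -> {
      innerScans[0]++;            // count how often the inner side is read
      return Arrays.asList(2, 3, 4);
    };

    List<Integer> result = new ArrayList<>();
    for (Integer x : semiJoin(Collections.<Integer>emptyList(), inner,
        Function.identity(), Function.identity())) {
      result.add(x);
    }
    System.out.println(result + " (inner scans: " + innerScans[0] + ")");

    for (Integer x : semiJoin(Arrays.asList(1, 2, 3, 2), inner,
        Function.identity(), Function.identity())) {
      result.add(x);
    }
    System.out.println(result + " (inner scans: " + innerScans[0] + ")");
  }
}
```

      With an empty outer input the inner supplier is never invoked. With a non-empty outer input it is invoked once, when the first outer item is processed.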

              People

              • Assignee: Ruben Q L (rubenql)
              • Reporter: Ruben Q L (rubenql)
              • Votes: 0
              • Watchers: 3

                  Time Tracking

                  • Estimated: Not Specified
                  • Remaining: 0h
                  • Logged: 3h