Uploaded image for project: 'Tajo (Retired)'
  1. Tajo (Retired)
  2. TAJO-1766

Improve the performance of cross join

    XMLWordPrintableJSON

Details

    • Improvement
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • None
    • 0.11.0
    • distributed query plan
    • None

    Description

      Cross join is one of the very heavy operations. Furthermore, this operator is performed by a single worker in the current implementation. (Please see the implementation of HashPartitioner. If partitionKeyIds is empty, getPartition() always returns a single value.)

      One possible alternative is executing cross join with broadcast join. That is, outer table (smaller one) is always broadcasted, and join is performed by the machine who stores a part of inner table.

      To do so, a new session variable is required to set the broadcast threshold for cross join.

      Attachments

        Activity

          People

            jihoonson Jihoon Son
            jihoonson Jihoon Son
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: