[SPARK-17791] Join reordering using star schema detection - ASF JIRA

Details

Type: Sub-task
Status: Resolved
Priority: Critical
Resolution: Fixed
Affects Version/s: 2.1.0
Fix Version/s: 2.2.0
Component/s: SQL
Labels: None
Target Version/s: 2.2.0

Description

This JIRA is a sub-task of ~~SPARK-17626~~.

The objective is to provide a consistent performance improvement for star schema queries. A star schema consists of one or more fact tables referencing a number of dimension tables. In general, queries against a star schema are expected to run fast because of the referential integrity (RI) constraints established among the tables. This design proposes a join reordering based on natural, generally accepted heuristics for star schema queries:

- Finds the star join with the largest fact table and places it on the driving arm of the left-deep join. This plan avoids large tables on the inner side of a join, and thus favors hash joins.
- Applies the most selective dimensions early in the plan to reduce the amount of data flowing through the later joins.
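
The two heuristics above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the implementation from the pull request: the `Table` class, the `star_join_order` function, the row counts, and the selectivity values are all hypothetical, the largest table is simply assumed to be the fact table, and RI constraints are not checked.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Table:
    """Hypothetical table statistics used only for this sketch."""
    name: str
    rows: int                  # estimated row count
    selectivity: float = 1.0   # fraction of rows kept by local predicates


def star_join_order(tables, joins):
    """Return a left-deep join order as a list of table names.

    The fact table comes first (driving arm), followed by its dimensions
    from most to least selective, followed by any tables that do not join
    the fact table directly.
    """
    by_name = {t.name: t for t in tables}

    # Heuristic 1: assume the largest table is the fact table.
    fact = max(tables, key=lambda t: t.rows)

    # Dimensions are the tables joined directly to the fact table.
    dims = set()
    for left, right in joins:
        if left == fact.name:
            dims.add(right)
        elif right == fact.name:
            dims.add(left)

    # Heuristic 2: a lower selectivity fraction filters more rows, so those
    # dimensions are applied first. The name breaks ties deterministically.
    ordered_dims = sorted(dims, key=lambda n: (by_name[n].selectivity, n))

    rest = [t.name for t in tables
            if t.name != fact.name and t.name not in dims]
    return [fact.name] + ordered_dims + rest


# Illustrative statistics; the numbers are made up.
tables = [
    Table("date_dim", 73_000, selectivity=0.01),
    Table("store_sales", 2_880_000),
    Table("item", 18_000, selectivity=0.2),
    Table("store", 12),
]
joins = [
    ("store_sales", "date_dim"),
    ("store_sales", "item"),
    ("store_sales", "store"),
]
print(star_join_order(tables, joins))
# ['store_sales', 'date_dim', 'item', 'store']
```

Because the fact table stays on the driving arm, every join in this order has a small dimension table on its inner side, which is the property the first heuristic relies on to favor hash joins.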

The design is described in the document attached below.

Attachments

StarJoinReordering1214.doc, 14/Dec/16 23:11, 491 kB, Ioana Delaney

Issue Links

is related to: SPARK-16026 Cost-based Optimizer Framework (Resolved)
links to: [Github] Pull Request #15363 (ioana-delaney)

People

Assignee: Ioana Delaney
Reporter: Ioana Delaney
Votes: 0
Watchers: 14

Dates

Created: 05/Oct/16 22:19
Updated: 20/Mar/17 08:08
Resolved: 20/Mar/17 08:08