SPARK-20233 (Spark; sub-task of SPARK-17626, TPC-DS performance improvements using star-schema heuristics)

Apply star-join filter heuristics to dynamic programming join enumeration

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.2.0
    • Fix Version/s: 2.2.0
    • Component/s: SQL
    • Labels: None

    Description

      This JIRA integrates star-join detection with the cost-based optimizer.

      Join enumeration using dynamic programming generates a set of feasible joins. Sub-optimal plans can then be eliminated by a sequence of independent, optional filters, which include heuristics for reducing the search space. For example:

      1. Star-join: Tables in a star-schema relationship are planned together, since joining them as a unit is assumed to give an optimal execution.
      2. Cartesian products: Cartesian products are deferred as late as possible to avoid large intermediate results (expanding joins, in general).
      3. Composite inners: “Bushy tree” plans are not generated, to avoid materializing intermediate results.
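
The way such filters plug into the enumeration can be sketched as follows. This is a minimal illustration in Python, not Spark's implementation: the function names (`enumerate_joins`, `no_cartesian_product`, `no_composite_inner`), the table names, and the representation of join predicates as a set of two-table edges are all assumptions made for the example. Each filter is an independent predicate over a candidate (outer, inner) pair, and a candidate is kept only if every enabled filter accepts it.

```python
from itertools import product


def enumerate_joins(tables, edges, filters=()):
    """Level-wise DP enumeration of candidate joins (hypothetical sketch).

    levels[k] holds every set of k tables for which some join was kept.
    Returns the list of (outer, inner) pairs that passed every filter.
    """
    levels = {1: {frozenset([t]) for t in tables}}
    accepted = []
    for size in range(2, len(tables) + 1):
        levels[size] = set()
        for outer_size in range(1, size):
            candidates = product(levels[outer_size], levels[size - outer_size])
            for outer, inner in candidates:
                if outer & inner:
                    continue  # the two sides must not share a table
                if all(f(outer, inner, edges) for f in filters):
                    accepted.append((outer, inner))
                    levels[size].add(outer | inner)
    return accepted


def no_cartesian_product(outer, inner, edges):
    # Keep the join only if some join predicate connects the two sides.
    return any(frozenset((a, b)) in edges for a in outer for b in inner)


def no_composite_inner(outer, inner, edges):
    # Left-deep plans only: the inner side must be a single table.
    return len(inner) == 1


# Assumed example: a chain of four tables a - b - c - d.
tables = ["a", "b", "c", "d"]
edges = {frozenset(p) for p in [("a", "b"), ("b", "c"), ("c", "d")]}

unfiltered = enumerate_joins(tables, edges)
pruned = enumerate_joins(tables, edges, [no_cartesian_product, no_composite_inner])
```

Because the filters are passed as a plain sequence, any subset of them can be switched on or off without touching the enumeration loop, which mirrors the "independent, optional" property described above. In this sketch `pruned` contains fewer candidate joins than `unfiltered`, while still reaching a plan that covers all four tables.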

      For reference, see “Measuring the Complexity of Join Enumeration in Query Optimization” by Ono et al.

      This JIRA implements the star-join filter.
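
One plausible reading of "planned together" is sketched below in Python. It is an assumption for illustration only, and the rule Spark actually applies may differ: the function name `star_join_filter`, the table names, and the idea that star-schema detection has already split the tables into `star_tables` and `non_star_tables` are all hypothetical. Under this reading, a candidate join is kept only if it does not mix star and non-star tables before the star join is complete.

```python
def star_join_filter(outer, inner, star_tables, non_star_tables):
    """Return True if joining `outer` with `inner` keeps star tables together.

    Hypothetical rule: the joined set must lie entirely inside the star join,
    contain the whole star join, or lie entirely outside it.
    """
    if not star_tables or not non_star_tables:
        return True  # nothing to separate, so the filter does not apply
    joined = outer | inner
    return (
        joined <= star_tables         # still building the star join
        or star_tables <= joined      # star join complete, now extending it
        or joined <= non_star_tables  # join among non-star tables only
    )


# Assumed example: one fact table with two dimensions, plus two other tables.
star = frozenset({"fact", "dim1", "dim2"})
non_star = frozenset({"t1", "t2"})

inside_star = star_join_filter(frozenset({"fact"}), frozenset({"dim1"}), star, non_star)
mixed_early = star_join_filter(frozenset({"fact"}), frozenset({"t1"}), star, non_star)
after_star = star_join_filter(star, frozenset({"t1"}), star, non_star)
outside_star = star_join_filter(frozenset({"t1"}), frozenset({"t2"}), star, non_star)
```

Here `mixed_early` is rejected because it pairs the fact table with a non-star table before the dimensions have been joined; the other three candidates are accepted.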

          People

            Assignee: Ioana Delaney (ioana-delaney)
            Reporter: Ioana Delaney (ioana-delaney)