Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.14.0
    • Component/s: CBO
    • Labels: None

    Description

      1. For composite predicates, smooth the selectivity calculation using exponential backoff. Thanks to mmokhtar for this formula.

      Can you change the algorithm to use exponential back-off:

      ndv(pe0) * ndv(pe1)^(1/2) * ndv(pe2)^(1/4) * ndv(pe3)^(1/8)

      as opposed to:

      ndv(pex) * log(ndv(pe1)) * log(ndv(pe2))

      If we assume a selectivity of 0.7 for each store_sales join, the join selectivity can end up as 6.24285E-05, which is too low and eventually results in a suboptimal plan.

      See attached picture.
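
The two formulas above can be compared with a minimal sketch. This is not the code from the attached patch. The function names are hypothetical, and it assumes that predicates are ordered by descending NDV, that "pex" denotes the predicate with the largest NDV, and that the logarithm is natural.

```python
import math


def backoff_ndv(ndvs):
    """Combine per-predicate NDVs with exponential back-off.

    Assumption: NDVs are sorted in descending order, so the largest gets
    exponent 1, the next 1/2, then 1/4, and so on.
    """
    ordered = sorted(ndvs, reverse=True)
    result = 1.0
    for i, ndv in enumerate(ordered):
        result *= ndv ** (1.0 / (2 ** i))
    return result


def log_smoothed_ndv(ndvs):
    """Earlier formula: largest NDV times the log of each remaining NDV.

    Assumptions: natural log, and the leading term is the largest NDV.
    """
    if not ndvs:
        raise ValueError("at least one NDV is required")
    ordered = sorted(ndvs, reverse=True)
    result = float(ordered[0])
    for ndv in ordered[1:]:
        result *= math.log(ndv)
    return result
```

For NDVs of 100, 16 and 16, the back-off form evaluates to roughly 100 * 4 * 2 = 800, so each additional predicate contributes less than the one before it.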

      2. For Fact-Dim joins on the Dim primary key, we infer the join cardinality as a filter on the Fact table:

      join card = rowCount(Fact table) * selectivity(dim table)
      

      Whether a column is a key is inferred from either of the following:

      • table rowCount = column ndv
      • (tbd shortly) table rowCount = (maxVal - minVal)
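
The key inference and the join-cardinality rule above can be sketched as follows. This is an illustration, not the patch's code. The function names and parameters are hypothetical, and `dim_selectivity` is assumed to be the fraction of Dim rows that survive the Dim-side filters. The range-based check mirrors the second bullet, which the description marks as still to be done.

```python
def is_key_column(row_count, ndv, min_val=None, max_val=None):
    """Guess whether a column is a key from table and column statistics."""
    # Rule 1: every row has a distinct value.
    if row_count == ndv:
        return True
    # Rule 2 (marked "tbd" in the description): the value range matches
    # the row count. Only applied when both bounds are known.
    if min_val is not None and max_val is not None:
        return row_count == (max_val - min_val)
    return False


def fact_dim_join_cardinality(fact_row_count, dim_selectivity):
    """Treat the Dim side as a filter on the Fact table.

    join card = rowCount(Fact table) * selectivity(dim table)
    """
    return fact_row_count * dim_selectivity
```

Under these assumptions, a Fact table of 1,000,000 rows joined to a Dim table filtered down to a quarter of its rows gives an estimate of 250,000 rows.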

      Attachments

        1. HIVE-7905.3.patch
          28 kB
          Harish Butani
        2. HIVE-7905.2.patch
          28 kB
          Harish Butani
        3. exp-backoff-vs-log-smoothing
          20 kB
          Harish Butani

            People

              Assignee: Harish Butani (rhbutani)
              Reporter: Harish Butani (rhbutani)