  1. IMPALA
  2. IMPALA-9971 Use defined table constraints during query planning
  3. IMPALA-9972

Use defined referential constraints for join cardinality calculations

Details

    • Type: Sub-task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Component/s: Frontend

    Description

      Currently, an estimation technique is used to determine whether the join predicates constitute a foreign key -> primary key type of functional dependency. These types of joins are common in "star schemas" and allow for certain query planning optimizations.

      However, the current technique can produce both false negatives and false positives because it relies on table stats, which can be out of date or incorrect due to the statistical methods used to derive them. For example, the HyperLogLog algorithm used by stats computation to estimate the number of distinct values for a column can have highly variable error rates.
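To illustrate how estimation error can flip the outcome in either direction, here is a minimal sketch of a stats-based key check. It is not Impala's actual code: the class `StatsBasedKeyCheck`, the method `looksLikePrimaryKey`, and the 5% tolerance are assumptions chosen only for illustration.

```java
public final class StatsBasedKeyCheck {
    private StatsBasedKeyCheck() {}

    /**
     * Hypothetical heuristic: treat a column as a primary key when its
     * estimated NDV is within `tolerance` of the table's row count.
     */
    public static boolean looksLikePrimaryKey(long estimatedNdv, long rowCount,
                                              double tolerance) {
        if (rowCount <= 0 || estimatedNdv <= 0) return false;
        double ratio = Math.min(1.0, (double) estimatedNdv / rowCount);
        return ratio >= 1.0 - tolerance;
    }

    public static void main(String[] args) {
        long rows = 1_000_000L;
        // Truly unique column whose NDV was underestimated by 7%:
        // the check rejects it (false negative).
        System.out.println(looksLikePrimaryKey(930_000L, rows, 0.05));
        // Non-unique column (true NDV 940,000) overestimated by 3% to 968,200:
        // the check accepts it (false positive).
        System.out.println(looksLikePrimaryKey(968_200L, rows, 0.05));
    }
}
```

Any fixed tolerance has this problem: widening it admits more false positives, narrowing it admits more false negatives.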

      In cases where a referential integrity constraint exists and is defined in the table metadata, this information should be used instead of the stats-based estimation to determine the type and cardinality of a join.
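The proposed behavior could be sketched as follows, assuming a planner that is told whether a foreign key -> primary key constraint is declared for the join columns. All names here (`JoinCardinalitySketch`, `classify`, `estimate`) and the tolerance parameter are hypothetical, and the cardinality formulas are the common textbook ones rather than a confirmed description of Impala's planner.

```java
public final class JoinCardinalitySketch {
    public enum JoinKind { FK_PK, GENERIC }

    private JoinCardinalitySketch() {}

    /** Prefer a declared constraint; fall back to NDV stats only when none exists. */
    public static JoinKind classify(boolean declaredFkPk, long rhsNdv,
                                    long rhsRowCount, double tolerance) {
        if (declaredFkPk) return JoinKind.FK_PK;
        if (rhsRowCount <= 0 || rhsNdv <= 0) return JoinKind.GENERIC;
        double ratio = Math.min(1.0, (double) rhsNdv / rhsRowCount);
        return ratio >= 1.0 - tolerance ? JoinKind.FK_PK : JoinKind.GENERIC;
    }

    /**
     * lhsCard, rhsCard: input cardinalities after filters.
     * rhsRowCount: unfiltered row count of the primary key side.
     */
    public static long estimate(JoinKind kind, long lhsCard, long rhsCard,
                                long rhsRowCount, long lhsNdv, long rhsNdv) {
        if (kind == JoinKind.FK_PK) {
            // Each FK row matches at most one PK row; scale by the
            // fraction of PK rows that survive filtering.
            return Math.round((double) lhsCard * rhsCard / Math.max(1L, rhsRowCount));
        }
        // Generic equi-join estimate: |L| * |R| / max(ndv(L), ndv(R)).
        long maxNdv = Math.max(1L, Math.max(lhsNdv, rhsNdv));
        return Math.round((double) lhsCard * rhsCard / maxNdv);
    }

    public static void main(String[] args) {
        // Dimension table: 1,000 rows, 100 left after filtering; its NDV
        // estimate (900) is too low for the stats check to recognize the key.
        JoinKind byStats = classify(false, 900L, 1_000L, 0.05);
        JoinKind byConstraint = classify(true, 900L, 1_000L, 0.05);
        System.out.println(byStats + " " + byConstraint);
        System.out.println(estimate(byConstraint, 10_000_000L, 100L, 1_000L, 1_000L, 900L));
    }
}
```

With the declared constraint, the join is classified as FK/PK regardless of how far the NDV estimate drifts from the row count.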

            People

              superdupershant Shant Hovsepian