Details
- New Feature
- Status: In Progress
- Major
- Resolution: Unresolved
- Impala 2.5.0, Impala 2.6.0
- None
- CDH
Description
Oracle has a "RELY NOVALIDATE" option for constraints. It could be easier for Hive to start with something like that for PK/FK constraints, so that the CBO has more information for optimizations. It does not have to actually check whether the constrained relationship holds; it can just "rely" on that constraint.
https://docs.oracle.com/database/121/SQLRF/clauses002.htm#sthref2289
This would help with join cardinality estimates, and with cases like IMPALA-2929.
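As a rough illustration of why a declared FK/PK relationship helps join cardinality estimates, here is a minimal sketch. The function names and formulas are illustrative assumptions (textbook-style estimates), not Impala's actual planner code.

```python
def estimate_generic_join(left_rows, right_rows, left_ndv, right_ndv):
    """Textbook equi-join estimate when nothing is known about the relationship.

    Relies on distinct-value (NDV) statistics for the join columns.
    """
    ndv = max(left_ndv, right_ndv, 1)  # guard against missing or zero NDV
    return left_rows * right_rows / ndv


def estimate_fk_pk_join(fk_rows, pk_rows_after_filter, pk_rows_total):
    """Estimate for a join the planner is told is FK -> PK (hypothetical helper).

    Each FK row matches at most one PK row, so the output is the FK side
    scaled by the fraction of PK rows that survive their filters.
    No NDV statistics are needed.
    """
    if pk_rows_total <= 0:
        return 0.0
    return fk_rows * pk_rows_after_filter / pk_rows_total


# Hypothetical numbers: a 1M-row fact table joined to a 1,000-row dimension
# of which 100 rows pass a filter.
fk_pk_estimate = estimate_fk_pk_join(1_000_000, 100, 1_000)
generic_estimate = estimate_generic_join(1_000_000, 100, 1_000, 100)
```

With a relied-upon constraint the planner can choose the FK/PK form directly instead of guessing the relationship from column statistics.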
https://docs.oracle.com/database/121/DWHSG/schemas.htm#DWHSG9053
"Overview of Constraint States":
- Enforcement
- Validation
- Belief
So an FK/PK constraint declared with "rely novalidate" has Enforcement and Validation disabled but Belief = RELY, as is possible in Oracle and now in Hive (HIVE-13076).
This opens up many additional ways to optimize execution plans,
as explained in Tom Kyte's "Metadata matters":
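The three constraint states can be modelled as independent flags. The sketch below is an assumption about how a planner might represent them; the class and method names are hypothetical and do not come from Impala, Hive, or Oracle.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConstraintState:
    """Hypothetical model of the three constraint states listed above."""
    enforced: bool   # Enforcement: are new writes checked against the constraint?
    validated: bool  # Validation: has existing data been checked?
    rely: bool       # Belief: may the optimizer trust it without checking?

    def planner_may_trust(self) -> bool:
        # Assumption: the optimizer can use a constraint either because the
        # data was validated or because the user declared RELY.
        return self.validated or self.rely


# "RELY NOVALIDATE": nothing is checked, but the optimizer may believe it.
RELY_NOVALIDATE = ConstraintState(enforced=False, validated=False, rely=True)
# A purely informational constraint that the optimizer should ignore.
NORELY_NOVALIDATE = ConstraintState(enforced=False, validated=False, rely=False)
```

Keeping Belief separate from Enforcement and Validation is what lets a system with no constraint checking at all still benefit from declared keys.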
http://www.peoug.org/wp-content/uploads/2009/12/MetadataMatters_PEOUG_Day2009_TKyte.pdf
p. 30 - "Tell us how the tables relate and we can remove them from the plan...".
p. 35 - "Tell us how the tables relate and we have more access paths available...".
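The "remove them from the plan" case is join elimination: if a query inner-joins a child table to its parent on a relied-upon FK and uses no parent columns elsewhere, the parent contributes nothing to the result. The following is a simplified sketch under stated assumptions (single-column keys, inner joins only, unique parent key, non-nullable FK column); all names are hypothetical and it is not how Impala or Oracle implement it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ForeignKey:
    child_table: str
    child_column: str
    parent_table: str
    parent_column: str     # assumed to be the parent's primary key (unique)
    rely: bool             # user declared RELY; nothing is validated
    child_nullable: bool   # NULL FK values would be dropped by an inner join


@dataclass(frozen=True)
class Join:
    left_table: str
    left_column: str
    right_table: str
    right_column: str


def removable_tables(joins, referenced_tables, foreign_keys):
    """Return parent tables whose inner join can be dropped.

    referenced_tables: tables whose columns are used outside join conditions
    (select list, filters, grouping).
    """
    removable = set()
    for fk in foreign_keys:
        if not fk.rely or fk.child_nullable:
            continue
        # The parent must take part in exactly one join: the FK = PK join.
        touching = [j for j in joins
                    if fk.parent_table in (j.left_table, j.right_table)]
        if len(touching) != 1:
            continue
        j = touching[0]
        sides = {(j.left_table, j.left_column), (j.right_table, j.right_column)}
        fk_sides = {(fk.child_table, fk.child_column),
                    (fk.parent_table, fk.parent_column)}
        if sides == fk_sides and fk.parent_table not in referenced_tables:
            removable.add(fk.parent_table)
    return removable


# Hypothetical schema: sales.cust_id references customers.id.
fk = ForeignKey("sales", "cust_id", "customers", "id",
                rely=True, child_nullable=False)
joins = [Join("sales", "cust_id", "customers", "id")]
only_sales_used = removable_tables(joins, {"sales"}, [fk])
both_used = removable_tables(joins, {"sales", "customers"}, [fk])
```

Because the constraint is only believed, not checked, a wrong declaration would make such a rewrite return wrong results; that trade-off is inherent in RELY.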
It might also be helpful when Impala is integrated with Kudu, as the latter requires every table to have a PK.
Issue Links
- blocks: IMPALA-4174 Planner incorrectly estimates cardinality for many to many joins (Resolved)
- is blocked by: HIVE-13349 Metastore Changes : API calls for retrieving primary keys and foreign keys information (Closed)
- is depended upon by: IMPALA-9971 Use defined table constraints during query planning (Open)
- is related to: IMPALA-2929 Add a hint to eliminate self-join for BI systems that don't support nested types yet (Open)
- relates to: SPARK-19842 Informational Referential Integrity Constraints Support in Spark (Open)