Details
-
New Feature
-
Status: Resolved
-
Minor
-
Resolution: Later
-
3.1.0
-
None
-
None
Description
Materialized view was introduced in Apache Hive 3.0.0. Currently, Spark Catalyst does not optimize queries against Hive tables using Materialized View the way Apache Calcite does it for Hive. This Jira is to add support for the same.
We have developed it in our internal trunk and would like to open source it. It would consist of 3 major parts:
- Reading MV related Hive Metadata
- Implication Engine which would figure out if an expression exp1 implies another expression exp2 i.e., if exp1 => exp2 is a tautology. This is similar to RexImplication checker in Apache Calcite.
- Catalyst rule to replace tables by it's Materialized view using Implication Engine. For e.g., if MV 'mv' has been created in Hive using query 'select * from foo where x > 10 && x <110' then query 'select * from foo where x > 70 and x < 100' will be transformed into 'select * from mv where x >70 and x < 100'
Note that Implication Engine and Catalyst Rule is generic can be used even when Spark decides to have it's own Materialized View.
Attachments
Issue Links
- links to