[CALCITE-1731] Rewriting of queries using materialized views with joins and aggregates - ASF JIRA

Details

Type: New Feature
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: None
Fix Version/s: 1.13.0
Component/s: core
Labels: None

Description

The idea is still to build a rewriting approach similar to:
ftp://ftp.cse.buffalo.edu/users/azhang/disc/SIGMOD/pdf-files/331/202-optimizing.pdf

I tried to build on the CALCITE-1389 work, but in the end I created a new, alternative rule. The main reason is that I wanted to follow the paper more closely and not rely on triggering rules within the MV rewriting to determine whether expressions are equivalent. Instead, we extract information from the query plan and the MV plans using the new metadata providers proposed in CALCITE-1682, and then use that information to validate and execute the rewriting.

I also implemented new unifying/rewriting logic within the rule, since the existing unifying rules for aggregates assumed that the aggregate inputs in the query and the MV had to be equivalent (same Volcano node). That condition can be relaxed because the rule verifies, using the new metadata providers mentioned above, that the result of the query is contained within the MV.
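
To illustrate the relaxed condition, here is a minimal Python sketch; it is not Calcite code. The `SALES` table, its columns, the predicates, and the `aggregate` helper are all hypothetical. The MV filters on a wider range and groups by more columns than the query, so the query result is contained in the MV. Under those assumptions, the query can be answered by applying a compensating filter to the MV and rolling up its partial sums.

```python
from collections import defaultdict

# Hypothetical base table with columns (year, dept, amount).
SALES = [
    (2009, "a", 100),
    (2013, "a", 10),
    (2014, "a", 20),
    (2014, "a", 5),
    (2014, "b", 7),
    (2015, "b", 3),
]

def aggregate(rows, key, value):
    """Emulates GROUP BY key(row) with SUM(value(row))."""
    totals = defaultdict(int)
    for row in rows:
        totals[key(row)] += value(row)
    return dict(totals)

# MV: SELECT year, dept, SUM(amount) FROM sales
#     WHERE year >= 2010 GROUP BY year, dept
mv = aggregate(
    (r for r in SALES if r[0] >= 2010),
    key=lambda r: (r[0], r[1]),
    value=lambda r: r[2],
)

# Query: SELECT dept, SUM(amount) FROM sales WHERE year = 2014 GROUP BY dept
direct = aggregate(
    (r for r in SALES if r[0] == 2014),
    key=lambda r: r[1],
    value=lambda r: r[2],
)

# Rewriting over the MV: "year = 2014" implies "year >= 2010", so the query
# rows are contained in the MV. Apply the compensating filter on the MV's
# year column, then roll up the partial sums by dept.
mv_rows = [(year, dept, total) for (year, dept), total in mv.items()]
rewritten = aggregate(
    (r for r in mv_rows if r[0] == 2014),
    key=lambda r: r[1],
    value=lambda r: r[2],
)

print(direct == rewritten)
```

The rollup works here because SUM can be recomputed from partial sums. An aggregate function without that property would need different handling.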

I added multiple tests, but any feedback pointing to new tests that could be added to check correctness/coverage is welcome.

The algorithm can produce multiple rewritings for the same query node. It also supports multiple usages of the same table in the query and the MVs.

A few extensions that will follow this issue:

Extend the logic to filter the relevant MVs for a given query node, so that the approach scales as the number of MVs grows.
Produce rewritings using Union operators, e.g., a given query could be answered partially from the MV (year = 2014) and partially from the query itself (not(year = 2014)). If the MV is stored e.g. in Druid, this rewriting might be beneficial. As with the other rewritings, the decision on whether to use the rewriting in the end should be cost-based.
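
The Union-based extension can be sketched in the same spirit. This is a toy Python illustration under assumed data, not the rule's implementation. The `SALES` table and the `aggregate` helper are hypothetical. The MV only covers `year = 2014`, so the remaining rows are read from the base table with the complementary predicate, and the two partial results are combined.

```python
from collections import defaultdict

# Hypothetical base table with columns (year, dept, amount); no NULL years.
SALES = [
    (2009, "a", 100),
    (2013, "a", 10),
    (2014, "a", 20),
    (2014, "a", 5),
    (2014, "b", 7),
    (2015, "b", 3),
]

def aggregate(rows, key, value):
    """Emulates GROUP BY key(row) with SUM(value(row))."""
    totals = defaultdict(int)
    for row in rows:
        totals[key(row)] += value(row)
    return dict(totals)

# MV: SELECT dept, SUM(amount) FROM sales WHERE year = 2014 GROUP BY dept
mv = aggregate(
    (r for r in SALES if r[0] == 2014),
    key=lambda r: r[1],
    value=lambda r: r[2],
)

# Query: SELECT dept, SUM(amount) FROM sales GROUP BY dept
direct = aggregate(SALES, key=lambda r: r[1], value=lambda r: r[2])

# Part of the answer that the MV does not cover: not(year = 2014).
rest = aggregate(
    (r for r in SALES if not r[0] == 2014),
    key=lambda r: r[1],
    value=lambda r: r[2],
)

# UNION ALL of both branches, followed by a final rollup of the partial sums.
union_all = list(mv.items()) + list(rest.items())
rewritten = aggregate(union_all, key=lambda kv: kv[0], value=lambda kv: kv[1])

print(direct == rewritten)
```

Whether such a plan is cheaper than scanning the base table alone depends on where the MV is stored and how large each branch is, which is why the choice is left to the cost model.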

Issue Links

blocks
HIVE-17432 Enable join and aggregate materialized view rewriting (Closed)

breaks
CALCITE-1767 Fix join/aggregate rewriting rule when same table is referenced more than once (Closed)

depends upon
CALCITE-1682 New metadata providers for expression column origin and all predicates in plan (Closed)

is related to
HIVE-17053 Enable improved Calcite MV-based rewriting rules (Resolved)
CALCITE-1791 Support view partial rewriting in join materialized view rewriting (Closed)
CALCITE-1795 Extend materialized view rewriting to produce rewritings using Union operators (Closed)
CALCITE-1797 Support view partial rewriting in aggregate materialized view rewriting (Closed)

relates to
CALCITE-5756 Expand ProjectJoinRemoveRule to support inner join removal by using the foreign-unique constraints (In Progress)

supercedes
CALCITE-1389 Add rule to perform rewriting of queries using materialized views with joins (Closed)

People

Assignee: Jesús Camacho Rodríguez

Reporter: Jesús Camacho Rodríguez

Votes: 0

Watchers: 5

Dates

Created: 30/Mar/17 17:56

Updated: 27/Feb/24 22:24

Resolved: 26/Apr/17 19:19