[IMPALA-5976] Remove equivalence classes - ASF JIRA

XML

Word

Printable

JSON

Details

Type: Sub-task
Status: Resolved
Priority: Critical
Resolution: Fixed
Affects Version/s: Impala 2.5.0, Impala 2.6.0, Impala 2.7.0, Impala 2.8.0, Impala 2.9.0, Impala 2.10.0
Fix Version/s: Impala 2.11.0
Component/s: Frontend
Labels:
None

Epic Color:
ghx-label-5

Description

The simplest solution to speed up the equivalence-class computation is to remove the equivalence class altogether.

A few points for justification:

The purpose of the equivalence classes is to provide a condensed version of the value-transfer graph to speed up certain equivalence checks/transformations. All necessary information is already captured in the value-transfer graph, so the high cost of computing the equivalence classes is not justified.
Incomplete/wrong definition of equivalence classes. In today's code, the equivalence class of a slot contains a list of slots where each pair of slots has a value transfer. This definition does not make sense, in particular, for unions which produce value transfers of the form s1 -> s2, s1 -> s3 where according to our definition s1 can only be in the same class as s2 or s3, but not both (because s2 and s3 do not have a value transfer). We could redefine the equivalence class of only containing slot pairs with mutual value transfers, but that information is already efficiently accessible from the value-transfer graph.

The challenge in removing the equivalence classes is to rework all the existing code that relies on them to use the value-transfer graph directly.

One of the tricky functions that needs to change is Analyzer.equivExprs(). An alternative implementation might look like this:

// Returns true if 'other' is equivalent to this. Equivalence means that the two expr trees are equal except for SlotRefs which match if there
// is another SlotRef in the other expr tree at the same position, and the underlying slots have a value transfer. If 'mutualOnly' is true,
// SlotRefs are only considered equivalent if they there is a mutual value transfer.
boolean Expr.isEquiv(Expr other, boolean mutualOnly);

The existing equals() function can probably be refactored into a new function that can be called from both equals() and isEquiv() or something along those lines.

Attachments

- Sort By Name
- Sort By Date
- Ascending
- Descending

queryB.sql
18/Oct/17 17:48
155 kB
Tianyi Wang
queryA.sql
18/Oct/17 17:48
155 kB
Tianyi Wang

Issue Links

incorporates

IMPALA-6213 The partitioning compatibility check is wrong in consecutive outer join cases

Resolved

Activity

People

Assignee:: Tianyi Wang

Reporter:: Alexander Behm

Votes:: 0 Vote for this issue

Watchers:: 2 Start watching this issue

Dates

Created:: 23/Sep/17 04:54

Updated:: 21/Nov/17 01:41

Resolved:: 20/Nov/17 18:35